04: Creating a CF-NetCDF file#

In this session we will create a basic NetCDF file that is compliant with the Attribute Convention for Data Discovery (ACDD) and the Climate and Forecast (CF) conventions.

Firstly, let’s import the libraries that we will work with.

if (!requireNamespace("RNetCDF", quietly = TRUE)) {
  install.packages("RNetCDF")
}
library(RNetCDF)

Initialising your file#

Let’s first create an empty object that we are going to use.

ncds <- create.nc("../data/exported_from_notebooks/test.nc")
print.nc(ncds)
netcdf classic {
}

Dimensions and coordinate variables#

Dimensions define the shape of your data. Variables (your data) can be assigned one or more dimensions. A dimension in most cases is a spatial or temporal dimension (e.g. time, depth, latitude, longitude) but could also be something else (e.g. iteration, number of vertices for data representative of cells).

Dimensions tell you how many points you have for each coordinate. Coordinate variables tell you what the values for those points are.

Let’s imagine a few simple scenarios. I’ll initialise a new NetCDF dataset each time.

1 dimension - depth#

ncds <- create.nc("../data/exported_from_notebooks/empty.nc")
depths <- c(0,10,20,30,50,100)
num_depths = length(depths)

dim.def.nc(ncds,"depth",num_depths)
print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
}

You then need to add a coordinate variable (I’ll again call it depth) which has a dimension of depth. It is quite common for the dimension and coordinate variable to have the same name.

First we define the variable. Below, the first argument ncds is my NetCDF file, depth is the name I am giving to the variable, NC_INT means the values will be integers, and the final argument depth says that this variable has one dimension called depth.

var.def.nc(ncds,"depth","NC_INT","depth")
print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
variables:
	NC_INT depth(depth) ;
}

Only now can we add our values to the variables.

var.put.nc(ncds,"depth", depths)
print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
variables:
	NC_INT depth(depth) ;
}
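
print.nc only shows the structure of the file, so the output looks the same before and after the values are written. One way to confirm that the values were stored is to read them back with var.get.nc. Below is a minimal sketch; the temporary file is an assumption made here so that the example does not touch the course's data folder.

```r
library(RNetCDF)

# Temporary file used only for this illustration
tmp_file <- tempfile(fileext = ".nc")
nc <- create.nc(tmp_file)

depths <- c(0, 10, 20, 30, 50, 100)
dim.def.nc(nc, "depth", length(depths))
var.def.nc(nc, "depth", "NC_INT", "depth")
var.put.nc(nc, "depth", depths)

# Read the coordinate values back from the file
depths_read <- as.vector(var.get.nc(nc, "depth"))
print(depths_read)

close.nc(nc)
```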

A key feature of a NetCDF file is that there is a defined structure so your data and metadata will always be in the same place within the file. This makes it easier for a machine to read it. We will add more types of data and metadata as we go, but first a few more examples.

A time series of data#

I’ll create a list of timestamps for myself first.

timestamps <- list(
  as.POSIXct("2023-06-18 00:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 03:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 06:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 09:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 12:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 15:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 18:00:00", tz = "UTC"),
  as.POSIXct("2023-06-18 21:00:00", tz = "UTC")
)

print(timestamps)
[[1]]
[1] "2023-06-18 UTC"

[[2]]
[1] "2023-06-18 03:00:00 UTC"

[[3]]
[1] "2023-06-18 06:00:00 UTC"

[[4]]
[1] "2023-06-18 09:00:00 UTC"

[[5]]
[1] "2023-06-18 12:00:00 UTC"

[[6]]
[1] "2023-06-18 15:00:00 UTC"

[[7]]
[1] "2023-06-18 18:00:00 UTC"

[[8]]
[1] "2023-06-18 21:00:00 UTC"

There are specific recommendations on how time should be stored in NetCDF-CF files. I will try to explain briefly here, and there is a nice explanation here too: https://www.unidata.ucar.edu/software/netcdf/time/recs.html

It is most common to have a dimension named “time” as well as a coordinate variable with the same name. Let’s discuss the variable first.

The “time” variable has units that count from a user defined origin, for example “hours since 2020-01-01 00:00 UTC” or “days since 2014-01-01”. The units may be in years, days, seconds, nanoseconds, etc. Whilst this approach may seem strange at a glance, it allows the times to be stored in conventional numerical formats such as integers or floats, and to our desired precision. This is much more efficient than using a long timestamp string for each coordinate.

Some software (e.g. xarray in Python, Panoply) knows how to interpret this and will convert the values into timestamps when you extract the data from a CF-NetCDF file. Unfortunately, at the time of writing, RNetCDF does not do this automatically.

# Calculate the time differences in hours since the first timestamp
time_diff_hours <- sapply(timestamps, function(ts) as.integer(difftime(ts, timestamps[[1]], units = "hours")))
print(time_diff_hours)
[1]  0  3  6  9 12 15 18 21
num_times = length(time_diff_hours)

ncds <- create.nc("../data/exported_from_notebooks/1d.nc")
dim.def.nc(ncds,"time",num_times)
var.def.nc(ncds,"time","NC_INT","time")
var.put.nc(ncds,"time", time_diff_hours)
print.nc(ncds)
netcdf classic {
dimensions:
	time = 8 ;
variables:
	NC_INT time(time) ;
}
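
The file above stores the numbers 0 to 21 but not what they count from. CF expects a units attribute on the time variable for this. Below is a minimal sketch, assuming the origin is the first timestamp and writing to a temporary file (an assumption for illustration). The decoding step at the end is plain R arithmetic, not an RNetCDF feature.

```r
library(RNetCDF)

time_diff_hours <- c(0, 3, 6, 9, 12, 15, 18, 21)
origin <- as.POSIXct("2023-06-18 00:00:00", tz = "UTC")

tmp_file <- tempfile(fileext = ".nc")
nc <- create.nc(tmp_file)
dim.def.nc(nc, "time", length(time_diff_hours))
var.def.nc(nc, "time", "NC_INT", "time")
var.put.nc(nc, "time", time_diff_hours)

# State the step and origin of the time axis in the units attribute
time_units <- paste("hours since", format(origin, "%Y-%m-%dT%H:%M:%SZ"))
att.put.nc(nc, "time", "units", "NC_CHAR", time_units)
att.put.nc(nc, "time", "standard_name", "NC_CHAR", "time")

# Decode by hand: offsets in hours -> seconds -> timestamps
hours_read <- as.vector(var.get.nc(nc, "time"))
decoded <- origin + hours_read * 3600
print(decoded)

close.nc(nc)
```

With the units attribute in place, software that understands CF time can do the same conversion without being told the origin separately.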

Multiple dimensions#

Now let’s create a NetCDF file with multiple dimensions.

ncds <- create.nc("../data/exported_from_notebooks/3d.nc")
depths <- c(0,10,20,30,50,100)
latitudes <- c(78.5271,79.2316,80.3261)
longitudes <- c(30.1515,28.5810)

dim.def.nc(ncds,"depth",length(depths))
dim.def.nc(ncds,"latitude",length(latitudes))
dim.def.nc(ncds,"longitude",length(longitudes))

var.def.nc(ncds,"depth","NC_INT","depth")
var.def.nc(ncds,"latitude","NC_DOUBLE","latitude") # Values have decimal places, so NC_DOUBLE
var.def.nc(ncds,"longitude","NC_DOUBLE","longitude") # Values have decimal places, so NC_DOUBLE

var.put.nc(ncds, "depth", depths)
var.put.nc(ncds, "latitude", latitudes)
var.put.nc(ncds, "longitude", longitudes)
print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
	NC_DOUBLE latitude(latitude) ;
	NC_DOUBLE longitude(longitude) ;
}

Data Variables#

Now let’s add some data variables, using the same dimensions and coordinate variables as in the multi-dimensional file created directly above.

You can choose what name you assign for each variable. This is not standardised, but be sensible and clear. I will show you how to make your data variables conform to the CF conventions using variable attributes in the next section.

1D variable#

depths <- c(0,10,20,30,50,100)
chlorophyll_a <- c(21.5, 18.5, 17.6, 16.8, 15.2, 14.8) # Must be same length as the dimension

ncds <- create.nc("../data/exported_from_notebooks/1d_chla.nc")

# Dimension and coordinate variable
dim.def.nc(ncds,"depth",length(depths))
var.def.nc(ncds,"depth","NC_INT","depth")
var.put.nc(ncds, "depth", depths)

# Data variable with 1 dimension
var.def.nc(ncds,"chlorophyll_a", "NC_DOUBLE", "depth")
var.put.nc(ncds,"chlorophyll_a", chlorophyll_a)
print.nc(ncds)
print(var.get.nc(ncds,"chlorophyll_a"))
netcdf classic {
dimensions:
	depth = 6 ;
variables:
	NC_INT depth(depth) ;
	NC_DOUBLE chlorophyll_a(depth) ;
}
[1] 21.5 18.5 17.6 16.8 15.2 14.8

2D variable#

Now a 2D variable, e.g. values on a grid of latitudes and longitudes.

latitudes <- c(78.5271,79.2316,80.3261)
longitudes <- c(30.1515,28.5810)

# Create random wind speed values
wind_speed <- runif(length(latitudes) * length(longitudes), min = 0, max = 10)
# Reshape the wind speed values to match the latitude and longitude dimensions
wind_speed <- array(wind_speed, dim = c(length(latitudes), length(longitudes)))
print(wind_speed)
            [,1]     [,2]
[1,] 0.006746668 7.072877
[2,] 2.427205718 7.512966
[3,] 3.995843008 8.875598
ncds <- create.nc("../data/exported_from_notebooks/2d_wind_speed.nc")

# Dimensions and coordinate variables
dim.def.nc(ncds,"latitude",length(latitudes))
dim.def.nc(ncds,"longitude",length(longitudes))
var.def.nc(ncds,"latitude","NC_DOUBLE","latitude")
var.def.nc(ncds,"longitude","NC_DOUBLE","longitude")
var.put.nc(ncds, "latitude", latitudes)
var.put.nc(ncds, "longitude", longitudes)

# Data variable with 2 dimensions
var.def.nc(ncds, "wind_speed", "NC_DOUBLE", c("latitude", "longitude"))
var.put.nc(ncds, "wind_speed", wind_speed)
print.nc(ncds)
print(var.get.nc(ncds, "wind_speed"))
netcdf classic {
dimensions:
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_DOUBLE latitude(latitude) ;
	NC_DOUBLE longitude(longitude) ;
	NC_DOUBLE wind_speed(latitude, longitude) ;
}
            [,1]     [,2]
[1,] 0.006746668 7.072877
[2,] 2.427205718 7.512966
[3,] 3.995843008 8.875598

Now you can see that the wind_speed variable has two dimensions: latitude and longitude. This is another major advantage of NetCDF files over tabular data formats like CSV or XLSX, which are limited in their ability to store multi-dimensional data. This multidimensional array can be used by code and software as it is, without any pre-processing.
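
Because the array axes line up with the coordinate variables, a value can be looked up by its coordinates instead of by row number. Below is a small sketch in base R, using made-up wind speeds (illustrative values, not from the course data).

```r
latitudes <- c(78.5271, 79.2316, 80.3261)
longitudes <- c(30.1515, 28.5810)

# Illustrative values: rows follow latitude, columns follow longitude
wind_speed <- array(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5),
                    dim = c(length(latitudes), length(longitudes)))

# Find the index of each coordinate, then index the array directly
lat_index <- which(latitudes == 79.2316)
lon_index <- which(longitudes == 28.5810)
value <- wind_speed[lat_index, lon_index]
print(value)  # 5.5

# All longitudes at that latitude
print(wind_speed[lat_index, ])
```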

3D variable#

depths <- c(0,10,20,30,50,100)
latitudes <- c(78.5271,79.2316,80.3261)
longitudes <- c(30.1515,28.5810)
sea_water_temperature <- runif(length(depths) * length(latitudes) * length(longitudes), min = 0, max = 2)

# Reshape the sea water temperature values to match the depth, latitude, and longitude dimensions
sea_water_temperature <- array(sea_water_temperature, dim = c(length(depths), length(latitudes), length(longitudes)))
print(sea_water_temperature)

ncds <- create.nc("../data/exported_from_notebooks/3d_sea_water_temperature.nc")

# Dimensions and coordinate variables
dim.def.nc(ncds,"depth",length(depths))
dim.def.nc(ncds,"latitude",length(latitudes))
dim.def.nc(ncds,"longitude",length(longitudes))

var.def.nc(ncds,"depth","NC_INT","depth")
var.def.nc(ncds,"latitude","NC_DOUBLE","latitude")
var.def.nc(ncds,"longitude","NC_DOUBLE","longitude")

var.put.nc(ncds, "depth", depths)
var.put.nc(ncds, "latitude", latitudes)
var.put.nc(ncds, "longitude", longitudes)

# Data variable with 3 dimensions
var.def.nc(ncds, "sea_water_temperature", "NC_DOUBLE", c("depth", "latitude", "longitude"))
var.put.nc(ncds, "sea_water_temperature", sea_water_temperature)

print.nc(ncds)
, , 1

          [,1]      [,2]      [,3]
[1,] 0.7378683 0.7632724 1.3497407
[2,] 1.1297617 0.7281608 0.3089237
[3,] 0.8852834 0.8071538 1.4563060
[4,] 1.6461949 1.4149698 0.3467044
[5,] 1.2651997 1.6330159 1.9648974
[6,] 0.4972041 1.0990290 1.9946141

, , 2

          [,1]      [,2]      [,3]
[1,] 1.9005102 1.0983505 1.4121690
[2,] 1.2623425 0.7391690 1.5924651
[3,] 1.1554081 0.8991226 0.2394562
[4,] 0.9192146 1.0971142 0.9754731
[5,] 0.3057705 0.3559349 0.8854389
[6,] 1.3503461 0.4055826 0.5901712
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
	NC_DOUBLE latitude(latitude) ;
	NC_DOUBLE longitude(longitude) ;
	NC_DOUBLE sea_water_temperature(depth, latitude, longitude) ;
}
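
You do not have to read a whole variable back into memory. var.get.nc accepts start and count arguments (both counted from 1 in RNetCDF) to read a subset. Below is a sketch, assuming a small temporary file with deterministic values in place of the random temperatures, so that the result is easy to check.

```r
library(RNetCDF)

depths <- c(0, 10, 20, 30, 50, 100)
latitudes <- c(78.5271, 79.2316, 80.3261)
longitudes <- c(30.1515, 28.5810)

# Deterministic stand-in for the random temperatures used above
temperature <- array(seq_len(36) / 10,
                     dim = c(length(depths), length(latitudes), length(longitudes)))

tmp_file <- tempfile(fileext = ".nc")
nc <- create.nc(tmp_file)
dim.def.nc(nc, "depth", length(depths))
dim.def.nc(nc, "latitude", length(latitudes))
dim.def.nc(nc, "longitude", length(longitudes))
var.def.nc(nc, "sea_water_temperature", "NC_DOUBLE",
           c("depth", "latitude", "longitude"))
var.put.nc(nc, "sea_water_temperature", temperature)

# Read the full depth profile at the 2nd latitude and 1st longitude
profile <- as.vector(var.get.nc(nc, "sea_water_temperature",
                                start = c(1, 2, 1), count = c(6, 1, 1)))
print(profile)

close.nc(nc)
```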

3D data from data frame#

What if you have your data in Excel, a CSV file or some other tabular format? We can load the data into a data frame and then convert it to a 3D array.

I’ll create a dummy dataframe here.

depths <- c(0,10,20,30,50,100)
latitudes <- c(78.5271,79.2316,80.3261)
longitudes <- c(30.1515,28.5810)

# Create lists to store the coordinates and salinity values
depth_coordinates <- c()
latitude_coordinates <- c()
longitude_coordinates <- c()
salinity_values <- c()

# Generate the coordinates and salinity values for the grid
for (d in depths) {
  for (lat in latitudes) {
    for (lon in longitudes) {
      depth_coordinates <- c(depth_coordinates, rep(d, 1))
      latitude_coordinates <- c(latitude_coordinates, rep(lat, 1))
      longitude_coordinates <- c(longitude_coordinates, rep(lon, 1))
      salinity <- runif(1, min = 30, max = 35)  # Random salinity value between 30 and 35
      salinity_values <- c(salinity_values, salinity)
    }
  }
}

# Create a DataFrame
data <- data.frame(
  Depth = depth_coordinates,
  Latitude = latitude_coordinates,
  Longitude = longitude_coordinates,
  Salinity = salinity_values
)

head(data)
A data.frame: 6 × 4
  Depth Latitude Longitude Salinity
  <dbl>    <dbl>     <dbl>    <dbl>
1     0  78.5271   30.1515 34.95121
2     0  78.5271   28.5810 31.38647
3     0  79.2316   30.1515 33.82106
4     0  79.2316   28.5810 32.84098
5     0  80.3261   30.1515 33.13868
6     0  80.3261   28.5810 34.76885
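
The nested loops above can also be written with expand.grid, which varies its first argument fastest. Listing longitude first therefore reproduces the row order of the loop, with longitude changing fastest and depth slowest. Below is a base R sketch; data_alt is a name chosen here for illustration and the salinity values are random placeholders as before.

```r
depths <- c(0, 10, 20, 30, 50, 100)
latitudes <- c(78.5271, 79.2316, 80.3261)
longitudes <- c(30.1515, 28.5810)

# expand.grid varies the first argument fastest, so list longitude first
grid <- expand.grid(Longitude = longitudes,
                    Latitude = latitudes,
                    Depth = depths)

data_alt <- data.frame(
  Depth = grid$Depth,
  Latitude = grid$Latitude,
  Longitude = grid$Longitude,
  Salinity = runif(nrow(grid), min = 30, max = 35)
)

head(data_alt)
```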

Now, let’s create a multidimensional grid for our salinity variable. We need to be careful with the order here. The dataframe is sorted first by depth (6 depths), then by latitude (3 latitudes), then by longitude (2 longitudes), so longitude changes fastest from row to row. R fills an array with the first dimension changing fastest, so we first fill an array with dimensions (longitude, latitude, depth) and then reorder the axes to (depth, latitude, longitude) with aperm.

salinity_by_lon <- array(data$Salinity, dim = c(length(longitudes), length(latitudes), length(depths)))
# Reorder the axes from (longitude, latitude, depth) to (depth, latitude, longitude)
salinity_3d_array <- aperm(salinity_by_lon, c(3, 2, 1))
print(salinity_3d_array)
, , 1

         [,1]     [,2]     [,3]
[1,] 34.95121 33.82106 33.13868
[2,] 33.93619 34.21903 34.84651
[3,] 33.35741 33.34154 32.41679
[4,] 31.31581 32.51224 34.85481
[5,] 33.92391 32.38099 30.21363
[6,] 33.07140 32.72667 31.70215

, , 2

         [,1]     [,2]     [,3]
[1,] 31.38647 32.84098 34.76885
[2,] 32.72915 30.79808 33.91881
[3,] 34.35160 33.77941 30.42228
[4,] 32.56664 30.85710 31.64318
[5,] 33.69165 32.91478 32.51365
[6,] 33.14205 33.45412 34.49260
ncds <- create.nc("../data/exported_from_notebooks/3d_sea_water_salinity.nc")

# Dimensions and coordinate variables
dim.def.nc(ncds,"depth",length(depths))
dim.def.nc(ncds,"latitude",length(latitudes))
dim.def.nc(ncds,"longitude",length(longitudes))

var.def.nc(ncds,"depth","NC_INT","depth")
var.def.nc(ncds,"latitude","NC_DOUBLE","latitude")
var.def.nc(ncds,"longitude","NC_DOUBLE","longitude")

var.put.nc(ncds, "depth", depths)
var.put.nc(ncds, "latitude", latitudes)
var.put.nc(ncds, "longitude", longitudes)

# Data variable with 3 dimensions
var.def.nc(ncds, "salinity", "NC_DOUBLE", c("depth", "latitude", "longitude"))
var.put.nc(ncds, "salinity", salinity_3d_array)

print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
	NC_DOUBLE latitude(latitude) ;
	NC_DOUBLE longitude(longitude) ;
	NC_DOUBLE salinity(depth, latitude, longitude) ;
}
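
When reshaping tabular data like this, it is worth checking that the array still lines up with the rows of the data frame it came from, because R fills an array with the first dimension varying fastest. Below is a base R sketch on a tiny hypothetical grid (2 depths, 2 latitudes, 2 longitudes) in which each value encodes its own coordinates, so any misplacement is easy to spot.

```r
depths <- c(0, 10)
latitudes <- c(78, 79)
longitudes <- c(28, 30)

# Rows ordered like the data frame above: longitude fastest, depth slowest
grid <- expand.grid(Longitude = longitudes, Latitude = latitudes, Depth = depths)
# Encode the coordinates in the value so any misplacement is obvious
grid$Value <- grid$Depth * 10000 + grid$Latitude * 100 + grid$Longitude

# Fill with the fastest-varying column as the first dimension...
by_lon <- array(grid$Value,
                dim = c(length(longitudes), length(latitudes), length(depths)))
# ...then reorder the axes to (depth, latitude, longitude)
value_3d <- aperm(by_lon, c(3, 2, 1))

# Check one cell: depth = 10, latitude = 78, longitude = 30
print(value_3d[2, 1, 2])  # 107830

# Check every row of the table against the array
matches <- mapply(function(d, la, lo, v) {
  value_3d[match(d, depths), match(la, latitudes), match(lo, longitudes)] == v
}, grid$Depth, grid$Latitude, grid$Longitude, grid$Value)
print(all(matches))  # TRUE
```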

Metadata (attributes)#

Hurrah! Your data are in the NetCDF file. But is the file ready to publish? Will it be compliant with the FAIR principles? No! We need metadata.

Variable attributes are metadata that describe the variables. Global attributes are metadata that describe the file as a whole. You can find a list of attributes here provided by the Climate & Forecast (CF) conventions: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#attribute-appendix

The table in the link above specifies which attributes can be used as global attributes and which can be used as variable attributes. Some attributes can be used as either.

The CF conventions are light on discovery metadata. Discovery metadata are metadata that can be used to find data. For example, when and where the data were collected and by whom, some keywords etc. So we also use the ACDD convention - The Attribute Convention for Data Discovery. https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3

This is a list of recommendations. Requirements are a more effective way to encourage consistency than recommendations, so SIOS advises that people follow the requirements of the Arctic Data Centre. These requirements are compliant with the ACDD conventions: https://adc.met.no/node/4

Variable attributes#

The CF conventions provide examples of which variable attributes you should be including in your CF-NetCDF file. For example for latitude: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#latitude-coordinate

Let’s replicate that setup.

Additionally, the ACDD convention recommends that an attribute coverage_content_type is also added, which is used to state whether the data are modelResult, physicalMeasurement or something else. See the list here: https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3#Highly_Recommended_Variable_Attributes

And remember we might want to select additional applicable attributes for our variables from this section of the CF conventions: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#attribute-appendix

att.put.nc(ncds, "latitude", "standard_name", "NC_CHAR", "latitude")
att.put.nc(ncds, "latitude", "long_name", "NC_CHAR", "latitude")
att.put.nc(ncds, "latitude", "units", "NC_CHAR", "degrees_north")
att.put.nc(ncds, "latitude", "coverage_content_type", "NC_CHAR", "coordinate")

att.put.nc(ncds, "longitude", "standard_name", "NC_CHAR", "longitude")
att.put.nc(ncds, "longitude", "long_name", "NC_CHAR", "longitude")
att.put.nc(ncds, "longitude", "units", "NC_CHAR", "degrees_east")
att.put.nc(ncds, "longitude", "coverage_content_type", "NC_CHAR", "coordinate")

att.put.nc(ncds, "depth", "standard_name", "NC_CHAR", "depth")
att.put.nc(ncds, "depth", "long_name", "NC_CHAR", "depth below sea level")
att.put.nc(ncds, "depth", "units", "NC_CHAR", "meters")
att.put.nc(ncds, "depth", "coverage_content_type", "NC_CHAR", "coordinate")
att.put.nc(ncds, "depth", "positive", "NC_CHAR", "down")

att.put.nc(ncds, "salinity", "standard_name", "NC_CHAR", "sea_water_salinity")
att.put.nc(ncds, "salinity", "long_name", "NC_CHAR", "a description about the variable in your own words")
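att.put.nc(ncds, "salinity", "units", "NC_CHAR", "psu") # Note: "psu" is not a UDUNITS unit, so a CF checker may flag it; the CF standard name table lists 1e-3 for sea_water_salinity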
att.put.nc(ncds, "salinity", "coverage_content_type", "NC_CHAR", "modelResult")

print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
		NC_CHAR depth:standard_name = "depth" ;
		NC_CHAR depth:long_name = "depth below sea level" ;
		NC_CHAR depth:units = "meters" ;
		NC_CHAR depth:coverage_content_type = "coordinate" ;
		NC_CHAR depth:positive = "down" ;
	NC_DOUBLE latitude(latitude) ;
		NC_CHAR latitude:standard_name = "latitude" ;
		NC_CHAR latitude:long_name = "latitude" ;
		NC_CHAR latitude:units = "degrees_north" ;
		NC_CHAR latitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE longitude(longitude) ;
		NC_CHAR longitude:standard_name = "longitude" ;
		NC_CHAR longitude:long_name = "longitude" ;
		NC_CHAR longitude:units = "degrees_east" ;
		NC_CHAR longitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE salinity(depth, latitude, longitude) ;
		NC_CHAR salinity:standard_name = "sea_water_salinity" ;
		NC_CHAR salinity:long_name = "a description about the variable in your own words" ;
		NC_CHAR salinity:units = "psu" ;
		NC_CHAR salinity:coverage_content_type = "modelResult" ;
}

Global attributes#

As mentioned above, the requirements of the Arctic Data Centre for global attributes (based on the ACDD convention) can serve as a guide for which global attributes you should be including. https://adc.met.no/node/4

And remember we might want to select additional applicable global attributes from this section of the CF conventions: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#attribute-appendix

Go through and add each required attribute and any others you wish to. You are also welcome to add any custom attributes on top of these requirements.

In RNetCDF, the syntax for adding a global attribute is the same as for adding a variable attribute, but we use a special variable name NC_GLOBAL.

# Define the global attributes as an R list
attributes <- list(
  id = "your_unique_id_here",
  naming_authority = "institution that provides the id",
  title = "my title",
  summary = "analogous to an abstract in the paper, describing the data and how they were collected and processed",
  creator_type = "person",
  creator_name = "John Smith; Luke Marsden", # Who collected and processed the data up to this point
  creator_email = "johns@unis.no; lukem@met.no",
  creator_institution = "The University Centre in Svalbard (UNIS); Norwegian Meteorological Institute (MET)",
  creator_url = "; https://orcid.org/0000-0002-9746-544X", # OrcID is best practice if possible. Other URLs okay, or leave blank for authors that don't have one.
  time_coverage_start = "2020-05-10T08:14:58Z",
  time_coverage_end = "2020-05-10T11:51:12Z",
  keywords = "sea_water_salinity",
  keywords_vocabulary = "CF:NetCDF COARDS Climate and Forecast Standard Names",
  institution = "Your Institution",
  publisher_name = "Publisher Name", # Data centre where your data will be published
  publisher_email = "publisher@email.com",
  publisher_url = "publisher_url_here",
  license = "https://creativecommons.org/licenses/by/4.0/",
  Conventions = "ACDD-1.3, CF-1.8", # Choose which ever version you will check your file against using a compliance checker
  project = "Your project name"
)

# Loop through the attributes and add them to the NetCDF file. 
for (key in names(attributes)) {
  att.put.nc(ncds, "NC_GLOBAL", key, "NC_CHAR", attributes[[key]])
}

# These attributes are all "NC_CHAR" format. You will need to adjust the code if writing other formats.

print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
		NC_CHAR depth:standard_name = "depth" ;
		NC_CHAR depth:long_name = "depth below sea level" ;
		NC_CHAR depth:units = "meters" ;
		NC_CHAR depth:coverage_content_type = "coordinate" ;
		NC_CHAR depth:positive = "down" ;
	NC_DOUBLE latitude(latitude) ;
		NC_CHAR latitude:standard_name = "latitude" ;
		NC_CHAR latitude:long_name = "latitude" ;
		NC_CHAR latitude:units = "degrees_north" ;
		NC_CHAR latitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE longitude(longitude) ;
		NC_CHAR longitude:standard_name = "longitude" ;
		NC_CHAR longitude:long_name = "longitude" ;
		NC_CHAR longitude:units = "degrees_east" ;
		NC_CHAR longitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE salinity(depth, latitude, longitude) ;
		NC_CHAR salinity:standard_name = "sea_water_salinity" ;
		NC_CHAR salinity:long_name = "a description about the variable in your own words" ;
		NC_CHAR salinity:units = "psu" ;
		NC_CHAR salinity:coverage_content_type = "modelResult" ;

// global attributes:
		NC_CHAR :id = "your_unique_id_here" ;
		NC_CHAR :naming_authority = "institution that provides the id" ;
		NC_CHAR :title = "my title" ;
		NC_CHAR :summary = "analogous to an abstract in the paper, describing the data and how they were collected and processed" ;
		NC_CHAR :creator_type = "person" ;
		NC_CHAR :creator_name = "John Smith; Luke Marsden" ;
		NC_CHAR :creator_email = "johns@unis.no; lukem@met.no" ;
		NC_CHAR :creator_institution = "The University Centre in Svalbard (UNIS); Norwegian Meteorological Institute (MET)" ;
		NC_CHAR :creator_url = "; https://orcid.org/0000-0002-9746-544X" ;
		NC_CHAR :time_coverage_start = "2020-05-10T08:14:58Z" ;
		NC_CHAR :time_coverage_end = "2020-05-10T11:51:12Z" ;
		NC_CHAR :keywords = "sea_water_salinity" ;
		NC_CHAR :keywords_vocabulary = "CF:NetCDF COARDS Climate and Forecast Standard Names" ;
		NC_CHAR :institution = "Your Institution" ;
		NC_CHAR :publisher_name = "Publisher Name" ;
		NC_CHAR :publisher_email = "publisher@email.com" ;
		NC_CHAR :publisher_url = "publisher_url_here" ;
		NC_CHAR :license = "https://creativecommons.org/licenses/by/4.0/" ;
		NC_CHAR :Conventions = "ACDD-1.3, CF-1.8" ;
		NC_CHAR :project = "Your project name" ;
}
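
The loop above assumes that every attribute is text. If some attributes are numeric, the NetCDF type can be chosen from the R type of each value. Below is a sketch, assuming only character and numeric values occur. nc_type_for is a hypothetical helper, not part of RNetCDF, and the attribute values are placeholders written to a temporary file.

```r
library(RNetCDF)

# Hypothetical helper: pick a NetCDF type name from an R value
nc_type_for <- function(value) {
  if (is.character(value)) {
    "NC_CHAR"
  } else if (is.numeric(value)) {
    "NC_DOUBLE"
  } else {
    stop("Unsupported attribute type: ", class(value)[1])
  }
}

# Placeholder attributes mixing text and numbers
mixed_attributes <- list(
  title = "my title",
  geospatial_vertical_min = 0,
  geospatial_vertical_max = 100
)

tmp_file <- tempfile(fileext = ".nc")
nc <- create.nc(tmp_file)
for (key in names(mixed_attributes)) {
  value <- mixed_attributes[[key]]
  att.put.nc(nc, "NC_GLOBAL", key, nc_type_for(value), value)
}

# Read one text and one numeric attribute back
title_read <- att.get.nc(nc, "NC_GLOBAL", "title")
vmax_read <- att.get.nc(nc, "NC_GLOBAL", "geospatial_vertical_max")
close.nc(nc)
```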

In this case, it makes sense to add some attributes based on information we have already provided.

# Use the full coordinate vectors (latitudes, longitudes), not the loop variables lat and lon
att.put.nc(ncds, "NC_GLOBAL", "geospatial_lat_min", "NC_FLOAT", min(latitudes))
att.put.nc(ncds, "NC_GLOBAL", "geospatial_lat_max", "NC_FLOAT", max(latitudes))
att.put.nc(ncds, "NC_GLOBAL", "geospatial_lon_min", "NC_FLOAT", min(longitudes))
att.put.nc(ncds, "NC_GLOBAL", "geospatial_lon_max", "NC_FLOAT", max(longitudes))

We can include the current time in the date_created and history attributes.

dtnow <- Sys.time()
attr(dtnow, "tzone") <- "UTC"
dt8601 <- format(dtnow, "%Y-%m-%dT%H:%M:%SZ") # date and time in ISO 8601 format
att.put.nc(ncds, "NC_GLOBAL", "date_created", "NC_CHAR", dt8601)
history <- paste("File created at", dtnow, "using RNetCDF by Luke Marsden")
att.put.nc(ncds, "NC_GLOBAL", "history", "NC_CHAR", history)
print.nc(ncds)
netcdf classic {
dimensions:
	depth = 6 ;
	latitude = 3 ;
	longitude = 2 ;
variables:
	NC_INT depth(depth) ;
		NC_CHAR depth:standard_name = "depth" ;
		NC_CHAR depth:long_name = "depth below sea level" ;
		NC_CHAR depth:units = "meters" ;
		NC_CHAR depth:coverage_content_type = "coordinate" ;
		NC_CHAR depth:positive = "down" ;
	NC_DOUBLE latitude(latitude) ;
		NC_CHAR latitude:standard_name = "latitude" ;
		NC_CHAR latitude:long_name = "latitude" ;
		NC_CHAR latitude:units = "degrees_north" ;
		NC_CHAR latitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE longitude(longitude) ;
		NC_CHAR longitude:standard_name = "longitude" ;
		NC_CHAR longitude:long_name = "longitude" ;
		NC_CHAR longitude:units = "degrees_east" ;
		NC_CHAR longitude:coverage_content_type = "coordinate" ;
	NC_DOUBLE salinity(depth, latitude, longitude) ;
		NC_CHAR salinity:standard_name = "sea_water_salinity" ;
		NC_CHAR salinity:long_name = "a description about the variable in your own words" ;
		NC_CHAR salinity:units = "psu" ;
		NC_CHAR salinity:coverage_content_type = "modelResult" ;

// global attributes:
		NC_CHAR :id = "your_unique_id_here" ;
		NC_CHAR :naming_authority = "institution that provides the id" ;
		NC_CHAR :title = "my title" ;
		NC_CHAR :summary = "analogous to an abstract in the paper, describing the data and how they were collected and processed" ;
		NC_CHAR :creator_type = "person" ;
		NC_CHAR :creator_name = "John Smith; Luke Marsden" ;
		NC_CHAR :creator_email = "johns@unis.no; lukem@met.no" ;
		NC_CHAR :creator_institution = "The University Centre in Svalbard (UNIS); Norwegian Meteorological Institute (MET)" ;
		NC_CHAR :creator_url = "; https://orcid.org/0000-0002-9746-544X" ;
		NC_CHAR :time_coverage_start = "2020-05-10T08:14:58Z" ;
		NC_CHAR :time_coverage_end = "2020-05-10T11:51:12Z" ;
		NC_CHAR :keywords = "sea_water_salinity" ;
		NC_CHAR :keywords_vocabulary = "CF:NetCDF COARDS Climate and Forecast Standard Names" ;
		NC_CHAR :institution = "Your Institution" ;
		NC_CHAR :publisher_name = "Publisher Name" ;
		NC_CHAR :publisher_email = "publisher@email.com" ;
		NC_CHAR :publisher_url = "publisher_url_here" ;
		NC_CHAR :license = "https://creativecommons.org/licenses/by/4.0/" ;
		NC_CHAR :Conventions = "ACDD-1.3, CF-1.8" ;
		NC_CHAR :project = "Your project name" ;
		NC_FLOAT :geospatial_lat_min = 78.527099609375 ;
		NC_FLOAT :geospatial_lat_max = 80.3261032104492 ;
		NC_FLOAT :geospatial_lon_min = 28.5809993743896 ;
		NC_FLOAT :geospatial_lon_max = 30.1515007019043 ;
		NC_CHAR :date_created = "2024-05-31T09:27:13Z" ;
		NC_CHAR :history = "File created at 2024-05-31 09:27:13.680088 using RNetCDF by Luke Marsden" ;
}

Finally, we close the file.

close.nc(ncds)

Checking your data#

Make sure you check your file thoroughly. Ideally, it should be reviewed by all co-authors, just like when publishing a paper.

There are also validators you can run your files through to make sure that your file is compliant with the ACDD and CF conventions before you publish it. For example: https://compliance.ioos.us/index.html
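
Before uploading to a validator, a quick local check can catch missing attributes. The sketch below reopens a file read-only and reports which of a list of global attributes are absent. The list is a short illustrative subset, not the Arctic Data Centre's full requirements, and has_global_att is a hypothetical helper that assumes att.get.nc raises an error when an attribute does not exist.

```r
library(RNetCDF)

# Build a small example file with only some attributes set
tmp_file <- tempfile(fileext = ".nc")
nc <- create.nc(tmp_file)
att.put.nc(nc, "NC_GLOBAL", "title", "NC_CHAR", "my title")
att.put.nc(nc, "NC_GLOBAL", "Conventions", "NC_CHAR", "ACDD-1.3, CF-1.8")
close.nc(nc)

# Hypothetical helper: TRUE if the global attribute can be read
has_global_att <- function(ncfile, name) {
  tryCatch({
    att.get.nc(ncfile, "NC_GLOBAL", name)
    TRUE
  }, error = function(e) FALSE)
}

# Illustrative subset of attributes to look for
wanted <- c("title", "summary", "Conventions", "license")

nc <- open.nc(tmp_file)  # read-only by default
present <- vapply(wanted, function(name) has_global_att(nc, name), logical(1))
close.nc(nc)

missing_atts <- wanted[!present]
print(missing_atts)
```

A check like this does not replace a compliance checker, but it gives fast feedback while you are still editing the script that writes the file.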

How to cite this course#

If you think this course contributed to the work you are doing, consider citing it in your list of references. Here is a recommended citation:

Marsden, L. (2024, May 31). NetCDF in R - from beginner to pro. Zenodo. https://doi.org/10.5281/zenodo.11400754

You can also follow the DOI link above to the publication and export the citation in different styles and formats.