Extracting Data from NC Files.
I was asked about extracting columns or fields of data from HadCM3 model output when there are lots of files to deal with and you may not want all of them. This Python script does that and writes a file named fieldName.txt to the OutputFolder folder.
#Example Usage
#./ncks_extraction.py "fieldName" "InputFolder" "OutputFolder"
The script requires the following standard imports and uses os.system so we can run system commands from within Python.
#System Imports
import glob,sys,re
from os import system
The following function extracts the column from the NC files using ncks, which is part of the NCO package.
#FUNCTION for extracting from nc into text file.
def ncksExtraction(nameOfField,file,outFile):
    #build the ncks command; >> appends the output to outFile
    command = "ncks -H -C -v %s -d latitude,14,14,1 -d longitude,20,20,1 %s >> %s" % (nameOfField,file,outFile)
    print(command)
    system(command)
    return 1
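As an alternative to building a shell string, the same ncks call can be sketched with subprocess, which avoids quoting problems when a folder name contains spaces. This is only a minimal sketch, not the script's own method: the names build_ncks_command and ncks_extraction_subprocess are hypothetical, and the latitude and longitude indices are copied from the function above.

```python
import subprocess


def build_ncks_command(name_of_field, in_file, lat_index=14, lon_index=20):
    """Return the ncks argument list for one field at one grid cell."""
    return [
        "ncks", "-H", "-C",
        "-v", name_of_field,
        "-d", "latitude,%d,%d,1" % (lat_index, lat_index),
        "-d", "longitude,%d,%d,1" % (lon_index, lon_index),
        in_file,
    ]


def ncks_extraction_subprocess(name_of_field, in_file, out_file):
    """Append the ncks output for in_file to out_file; return the exit code."""
    command = build_ncks_command(name_of_field, in_file)
    # Opening in append mode plays the role of the shell's >> redirection.
    with open(out_file, "a") as out:
        return subprocess.call(command, stdout=out)
```

Because the command is passed as a list, no shell is involved, so the file names are handed to ncks exactly as they are.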
In the main program we start by doing some housekeeping with the command line arguments to make sure the program has been called correctly.
## MAIN PROGRAM - Extract a single field from raw NC files and combine into a text file ##
#check we have the right usage
if len(sys.argv) != 4:
    sys.exit("Usage: ncks_extraction.py 'nameOfField' inputfolder outfolder")
#get our variables from the command line.
nameOfField = sys.argv[1]
inFolder = sys.argv[2]
outFolder = sys.argv[3]
print ("Extracting " + nameOfField + " from " + inFolder + " into " + outFolder)
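The same housekeeping could also be done with argparse from the standard library, which prints a usage message by itself when the arguments are wrong. This is a sketch under the assumption that the three positional arguments keep the names used above; parse_args is a hypothetical helper, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the field name, input folder and output folder."""
    parser = argparse.ArgumentParser(
        description="Extract a single field from NC files into a text file."
    )
    parser.add_argument("nameOfField", help="name of the field to extract")
    parser.add_argument("inFolder", help="folder containing the .nc files")
    parser.add_argument("outFolder", help="folder for the output text file")
    # With argv=None argparse reads sys.argv[1:], as the script above does.
    return parser.parse_args(argv)
```

Calling parse_args() with no argument reads the real command line; passing a list is handy for trying it out.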
The following line collects all the files from the folder that we passed as inFolder on the command line. The *.nc pattern tells glob to match only files with the .nc extension.
#get the files into an array
inFiles = glob.glob(inFolder + "/*.nc")
If you want to limit the files processed by ncks, the following line will do that, based on the order in which the files were loaded into the list. That order appears to be by creation time, but glob does not guarantee any particular order, so sort the list first if the order matters. The following trims the list to the last 5 files. You can read about Python slicing in the documentation.
#Comment out this line if you don't want to restrict which files you work with.
inFiles = inFiles[-5:]
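Since the order returned by glob is not guaranteed, one way to make the "last 5" predictable is to sort before slicing. The following is a minimal sketch; last_n_files is a hypothetical helper, and whether you want to sort by file name or by modification time depends on how your model output is named.

```python
import glob
import os


def last_n_files(folder, n, by_mtime=False):
    """Return the last n .nc files in folder, sorted by name or by modification time."""
    files = glob.glob(os.path.join(folder, "*.nc"))
    # glob gives no ordering guarantee, so sort explicitly before slicing.
    key = os.path.getmtime if by_mtime else None
    files = sorted(files, key=key)
    # Note: n must be at least 1, because files[-0:] would return the whole list.
    return files[-n:]
```

For example, inFiles = last_n_files(inFolder, 5) would replace both the glob line and the slicing line above.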
Best now check we have some files left.
#Check we have some files
if len(inFiles) == 0:
    sys.exit("No files in source folder")
else:
    print(len(inFiles))
Because we are appending data to the output file with the >> operator (see the ncksExtraction() function), we must reset the file before we start. We can do this by redirecting /dev/null into it.
#Reset our output file
outFile = outFolder + "/" + nameOfField + ".txt"
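#-p stops mkdir complaining if the folder already exists
system("mkdir -p " + outFolder)
#redirecting /dev/null creates the file if needed and empties it
system("cat /dev/null > %s" % (outFile))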
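The reset can also be done without shelling out at all. Here is a minimal sketch in plain Python; reset_output_file is a hypothetical helper name, and it assumes the output file is called fieldName.txt as described above.

```python
import os


def reset_output_file(out_folder, name_of_field):
    """Create out_folder if needed and return the path to an empty output file."""
    os.makedirs(out_folder, exist_ok=True)
    out_file = os.path.join(out_folder, name_of_field + ".txt")
    # Opening in "w" mode truncates an existing file or creates a new one.
    with open(out_file, "w"):
        pass
    return out_file
```

With this, outFile = reset_output_file(outFolder, nameOfField) would stand in for the lines above.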
Now, for all the files left in our list, let's process them!
#foreach file in the array process it
for file in inFiles:
    ncksExtraction(nameOfField,file,outFile)