Use Perl to process a CSV file as a plain text file

There are several modules for processing CSV files in Perl. Please refer to this tutorial to learn more about them. It describes several ways to parse a CSV file, such as using the Text::CSV, Tie::CSV_File, Tie::Handle::CSV, and DBD::CSV modules. Here I describe a very simple way to process a CSV file without using any modules.

Since a CSV file is a plain text file whose fields are separated by commas, processing it is simple as long as no field contains a quoted or embedded comma. We can read the file line by line and split each line into fields. The procedure I present here is a Perl script that handles such a CSV file. The user provides the input filename and the output filename on the command line. An additional parameter ($layer) gives the zero-based index of the column to pick up.
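
As a minimal sketch of the line-splitting step described above, the following splits one line into its fields. The helper name split_fields is hypothetical and used only for illustration, and the sketch assumes that no field contains a quoted or embedded comma.

```perl
use strict;
use warnings;

# Split one comma-separated line into its fields.
# Assumes no field contains an embedded comma or quotes.
sub split_fields {
    my ($line) = @_;
    chomp($line);                 # drop the trailing newline
    return split(/,/, $line);     # one list element per field
}

my @fields = split_fields("1,0.08,0.072,0.08,11,0,4.91477,13\n");
print scalar(@fields), " fields; first is $fields[0], last is $fields[-1]\n";
# prints "8 fields; first is 1, last is 13"
```

Note that split drops trailing empty fields by default; passing -1 as its third argument keeps them.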

Here is a sample of input file:

CL,a,b,c,S,Weight,T1,Count
1,0.08,0.072,0.08,11,0,4.91477,13
3,0.08,0.072,0.08,11,0,5.51728,25
2,0.08,0.072,0.08,11,0,5.566,14

What we want to do is rearrange the data fields in the file and skip some of them. A sample of the output looks like the following:

a,CL,Weight,T1
0.08,1,0,4.91477
0.08,3,0,5.51728
0.08,2,0,5.566
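
The rearrangement shown in these samples can also be sketched with an array slice. This is an illustrative alternative, not the approach of the script in this article. The helper name pick_columns and the index list (1, 0, 5, 6) are assumptions chosen to match the sample columns a, CL, Weight, and T1, and the fields are assumed to be separated by commas in the actual file.

```perl
use strict;
use warnings;

# Reorder and drop columns with an array slice.
# The caller passes the zero-based indexes of the columns to keep,
# in the order they should appear in the output.
sub pick_columns {
    my ($line, @idx) = @_;
    my @elem = split(/,/, $line);
    return join(",", @elem[@idx]);
}

for my $line ("CL,a,b,c,S,Weight,T1,Count",
              "1,0.08,0.072,0.08,11,0,4.91477,13") {
    print pick_columns($line, 1, 0, 5, 6), "\n";
}
# prints:
# a,CL,Weight,T1
# 0.08,1,0,4.91477
```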

The above data is just for demonstration and has no particular meaning. As a sample, the data file is very small, so you might ask why not use Excel to process it. In reality, we might have a huge file far beyond Excel's capacity. We might also have hundreds of small CSV files with the same format. Processing them one by one in Excel would be too tedious.
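
For the case of many small files with the same format, a loop over a file list is one possible approach. The following is a rough sketch under stated assumptions: the helper name process_files, the "*.csv" pattern, and the ".out" suffix for output files are all hypothetical choices for illustration.

```perl
use strict;
use warnings;

# Apply a per-line transform to every file in a list.
# Each result is written to "<input name>.out" (an assumed naming scheme).
sub process_files {
    my ($transform, @files) = @_;
    my $count = 0;
    for my $infile (@files) {
        my $outfile = "$infile.out";
        open(my $in,  '<', $infile)  or die "cannot open $infile: $!\n";
        open(my $out, '>', $outfile) or die "cannot open $outfile: $!\n";
        while (my $line = <$in>) {
            chomp($line);
            print {$out} $transform->($line), "\n";
        }
        close($in);
        close($out) or die "cannot close $outfile: $!\n";
        $count++;
    }
    return $count;
}

# Example: write the second column, then the first, for every
# CSV file in the current directory.
my $n = process_files(
    sub {
        my @e = split(/,/, $_[0]);
        return join(",", grep { defined } @e[1, 0]);
    },
    glob("*.csv"),
);
print "processed $n file(s)\n";
```

Passing the transform as a code reference keeps the file handling separate from the column logic, so the same loop can be reused with a different rearrangement.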

The following is the Perl script to process this type of CSV data file.


#!/usr/bin/perl
use strict;
use warnings;

# Expect three arguments: input file, output file, column index.
my $numArgs = $#ARGV + 1;
if ($numArgs != 3) { die "usage: $0 infile outfile layer\n"; }
my $infile = $ARGV[0];
my $outfile = $ARGV[1];
my $layer = $ARGV[2];    # zero-based index of the column to put first

# open output file
open(MYOUT, '>', $outfile) or die "cannot open $outfile: $!\n";
# read input file
open(MYIN, '<', $infile) or die "cannot open $infile: $!\n";
my $linecount = 0;

while (<MYIN>) {
    # Good practice to store $_ value because
    # subsequent operations may change it.
    my ($line) = $_;

    # Good practice to always strip the trailing
    # newline from the line.
    chomp($line);

    $linecount++;

    print MYOUT processLine($line);
}

close(MYIN);
close(MYOUT);

sub processLine {
    my ($line) = @_;
    my @elem = split(/,/, $line);
    my $size = @elem;
    # The header text counts as 0 in the numeric test below.
    no warnings 'numeric';
    my $mytraitvalue = $elem[$layer];
    # Scale positive values of the selected column by 100.
    if ($mytraitvalue + 1 > 1) { $mytraitvalue = $elem[$layer]*100; }
    # Selected column first, then column 0.
    my $returnStr = "$mytraitvalue,$elem[0]";
    # Append columns 5 up to, but not including, the last one.
    for (my $i=5; $i<$size-1; $i++) {
        $returnStr .= ",$elem[$i]";
    }
    $returnStr .= "\n";
    return $returnStr;
}

© Zhanshan Dong (董占山)

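© 1998-, Zhanshan Dong (董占山). All rights reserved. Links to these articles are welcome.
When reposting articles or software, please cite the source (http://articles.sunfinedata.com/).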