An awk skeleton script that parses mail in mbox format and decides which lines are email header lines and which lines make up the body of each mail.
Code:
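# awk skeleton to parse mails in mbox format
# an empty line separates header from body

# header: from a line starting with "From" through the next empty line
/^From/, /^$/ {
    printf "\nhead : %s", $0
    next        # done with this line; skip the body rule below
}

# body: from an empty line up to the next line starting with "From"
/^$/, /^From/ {
    if ($1 ~ /^From/) next      # do not report a "From" line as body
    printf "\nbody : %s", $0
}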
Code:
$ awk -f awk-parse-mails mail-j65

head : From MAILER-DAEMON Thu Feb 24 01:50:56 2011
head : Date: 24 Feb 2011 01:50:56 +0100
head : From: Mail System Internal Data <MAILER-DAEMON@hercules.utp.xnet>
head : Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
head : Message-ID: <1298508656@hercules.utp.xnet>
head : X-IMAP: 1275177528 0000000491
head : Status: RO
head : 
body : 
body : This text is part of the internal format of your mail folder, and is not
body : a real message. It is created automatically by the mail system software.
body : If deleted, important folder data will be lost, and it will be re-created
body : with the data reset to initial values.
body : 
body : 
head : From j65nko@hercules.utp.xnet Thu Feb 24 03:03:11 2011
head : Received: from hercules.utp.xnet (localhost [127.0.0.1])
head :     by hercules.utp.xnet (8.14.3/8.14.3) with ESMTP id p1O23Bmk005438
head :     for <j65nko@hercules.utp.xnet>; Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : Received: (from j65nko@localhost)
head :     by hercules.utp.xnet (8.14.3/8.14.3/Submit) id p1O23B1a025655
head :     for j65nko; Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : Date: Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : From: j65nko@hercules.utp.xnet
head : Message-Id: <201102240203.p1O23B1a025655@hercules.utp.xnet>
head : To: j65nko@hercules.utp.xnet
head : Subject: apples
head : 
body : 
body : I like to eat apples
body : 
head : From j65nko@hercules.utp.xnet Thu Feb 24 03:03:11 2011
head : Received: from hercules.utp.xnet (localhost [127.0.0.1])
head :     by hercules.utp.xnet (8.14.3/8.14.3) with ESMTP id p1O23B5W023497
head :     for <j65nko@hercules.utp.xnet>; Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : Received: (from j65nko@localhost)
head :     by hercules.utp.xnet (8.14.3/8.14.3/Submit) id p1O23BHm007707
head :     for j65nko; Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : Date: Thu, 24 Feb 2011 03:03:11 +0100 (CET)
head : From: j65nko@hercules.utp.xnet
head : Message-Id: <201102240203.p1O23BHm007707@hercules.utp.xnet>
head : To: j65nko@hercules.utp.xnet
head : Subject: oranges
head : 
body : 
body : I like to eat oranges
body : 
head : From j65nko@hercules.utp.xnet Thu Feb 24 03:03:11 2011
head : Received: from hercules.utp.xnet (localhost [127.0.0.1])
head :     by hercules.utp.xnet (8.14.3/8.14.3) with ESMTP id p1O23BXo026743
head :     for <j65nko@hercules.utp.xnet>; Thu, 24 Feb 2011 03:03:11 +0100 (CET)
[snip]
Code:
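#!/usr/bin/perl
use strict ;
use warnings ;

while (<>) {
    chomp ;
    # header: from a line starting with "From" through the next empty line
    if (/^From/../^$/) {
        print "\nhead : $_" ;
        next ;
    }
    # body: from an empty line up to the next line starting with "From"
    if (/^$/.. /^From/) {
        if (/^From/) { next } ;
        print "\nbody : $_" ;
    }
}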
Code:
$ perl-parse-mails mail-j65 >results.perl
$ awk -f awk-parse-mails mail-j65 >results.awk
$ diff results.awk results.perl
$ cat -n results.awk | head -5
1
2 head : From MAILER-DAEMON Thu Feb 24 01:50:56 2011
3 head : Date: 24 Feb 2011 01:50:56 +0100
4 head : From: Mail System Internal Data <MAILER-DAEMON@hercules.utp.xnet>
5 head : Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
$
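For comparison, here is a minimal sketch of the same head/body split using an explicit state flag instead of range patterns. It is not the script above: the function name `parse_mbox` and the flag `inhead` are made up for illustration. It matches `From ` with a trailing space, so that only mbox separator lines open a header. It prints the separating empty line once, as a head line, and ends each line with a newline instead of starting with one, so its output will not be identical to the results shown above. It also assumes that body lines beginning with `From ` were already escaped by whatever wrote the mbox file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): label every line of an
# mbox file as "head" or "body" using a state flag.
parse_mbox() {
    awk '
        # A "From " line (trailing space) opens a header, unless we are
        # already inside one.
        !inhead && /^From / { inhead = 1 }

        # Inside a header: print the line; an empty line closes the header.
        inhead {
            printf "head : %s\n", $0
            if ($0 == "") inhead = 0
            next
        }

        # Everything else belongs to the body of the current mail.
        { printf "body : %s\n", $0 }
    ' "$@"
}

# Small demonstration on a hand-made two-line header plus one body line.
printf 'From a Thu\nSubject: x\n\nhello\n' | parse_mbox
```

Called as `parse_mbox mail-j65`, it reads the file the same way `awk -f awk-parse-mails mail-j65` does; without arguments it reads standard input.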
__________________
You don't need to be a genius to debug a pf.conf firewall ruleset, you just need the guts to run tcpdump
The scripts and the test file are available for download.
BTW, the emails in the test file were generated with:
Code:
for X in apples oranges kiwis ; do echo "I like to eat $X" | mail -s "$X" j65nko ; done
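On a machine without working local mail delivery, a rough stand-in for such a test file can be written by hand. The sketch below is an assumption-laden substitute, not how the original test file was made: the function name `make_test_mbox`, the sender address `tester@example.org`, and the fixed timestamp are all invented, and the messages carry only a `Subject:` header rather than the full set of headers a real delivery adds.

```shell
#!/bin/sh
# Hypothetical stand-in for the mail loop: emit a tiny mbox-style stream.
# Each message is: "From " separator line, one header, an empty line,
# one body line, and a trailing empty line.
make_test_mbox() {
    for X in apples oranges kiwis; do
        printf 'From tester@example.org Thu Feb 24 03:03:11 2011\n'
        printf 'Subject: %s\n' "$X"
        printf '\n'
        printf 'I like to eat %s\n' "$X"
        printf '\n'
    done
}

# Count the separator lines, one per generated message.
make_test_mbox | grep -c '^From '    # prints 3
```

Redirecting the output, for example `make_test_mbox > mail-test`, gives a file that can be fed to either parser in place of `mail-j65`.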
Last edited by J65nko; 24th February 2011 at 02:49 AM.