awk: splitting an XML file into coherent pieces

So I have this huge XML file that needs to be split into small pieces, but each piece should contain one coherent message, like:

<MSG>
...
</MSG>

I found the following command to split the file. The root tag for each message starts with <MSG, and the command splits a file called logfile.xml into logfilesegment files that each contain exactly one message.

oehoeboeroe:Resources aqua$ awk '/<MSG/ {close(output); output = f n++} n {print >> output}' f=logfilesegment logfile.xml
oehoeboeroe:Resources aqua$ l logfile*
-rwxr-xr-x  1 aqua  aqua  541115  6 Jan 19:41 logfile.xml
-rw-r--r--  1 aqua  aqua   41315 19 Jan 17:07 logfilesegment0
-rw-r--r--  1 aqua  aqua   36490 19 Jan 17:07 logfilesegment1
-rw-r--r--  1 aqua  aqua   34069 19 Jan 17:07 logfilesegment10
-rw-r--r--  1 aqua  aqua   39082 19 Jan 17:07 logfilesegment11
-rw-r--r--  1 aqua  aqua   40735 19 Jan 17:07 logfilesegment12
-rw-r--r--  1 aqua  aqua   40562 19 Jan 17:07 logfilesegment13
-rw-r--r--  1 aqua  aqua     125 19 Jan 17:07 logfilesegment14
-rw-r--r--  1 aqua  aqua     374 19 Jan 17:07 logfilesegment15
-rw-r--r--  1 aqua  aqua     806 19 Jan 17:07 logfilesegment16
-rw-r--r--  1 aqua  aqua   35987 19 Jan 17:07 logfilesegment2
-rw-r--r--  1 aqua  aqua   40717 19 Jan 17:07 logfilesegment3
-rw-r--r--  1 aqua  aqua   39323 19 Jan 17:07 logfilesegment4
-rw-r--r--  1 aqua  aqua   35355 19 Jan 17:07 logfilesegment5
-rw-r--r--  1 aqua  aqua   38024 19 Jan 17:07 logfilesegment6
-rw-r--r--  1 aqua  aqua   39250 19 Jan 17:07 logfilesegment7
-rw-r--r--  1 aqua  aqua   38360 19 Jan 17:07 logfilesegment8
-rw-r--r--  1 aqua  aqua   40541 19 Jan 17:07 logfilesegment9
oehoeboeroe:Resources aqua$ cat logfilesegment14
<MSG_VesselData>
<Header MsgRefId="{2b8d37ca-0000-4000-95bc-2dad915499b9}" Version="1.0" />
<Body>
</Body>
</MSG_VesselData>
oehoeboeroe:Resources aqua$
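
The listing above sorts the segments as 0, 1, 10, 11, ... because the counter is not padded. Here is a minimal sketch of a variant that zero-pads the counter so the files list in message order. The function name split_messages, the 4-digit width and the "<MSG" start pattern are my own assumptions, not part of the command I found.

```shell
# Split an XML log into one file per message, with zero-padded file names.
# Assumptions: every message starts on a line containing "<MSG",
# and four digits are enough for the counter.
split_messages() {
    # $1 = input file, $2 = prefix for the output files
    awk -v prefix="$2" '
        /<MSG/ {                        # opening tag of a new message
            if (out != "") close(out)   # release the previous segment file
            out = sprintf("%s%04d", prefix, n++)
        }
        out != "" { print > out }       # lines before the first message are skipped
    ' "$1"
}
```

Calling split_messages logfile.xml logfilesegment would then write logfilesegment0000, logfilesegment0001 and so on. As with the one-liner, anything that follows the last message (a closing wrapper tag, for instance) ends up in the last segment.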
