Welcome to MLink Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Extract data from .OUT file between 2 strings and create a new csv file using Python

My data looks like the following and is stored in a .OUT file:

{ID=ISIN Name=yes PROGRAM=abc START_of_FIELDS CODE END-OF-FIELDS TIMESTARTED=Mon Nov 30 20:45:56
   START-OF-DATA 
CODE|ERR CODE|NUM|EXCH_CODE|
912828U rp|0|1|BERLIN|
1392917 rp|0|1|IND| 
3CB0248 rp|0|1|BRAZIL| 
END-OF-DATA***}

I need to extract the lines between START-OF-DATA and END-OF-DATA from the .OUT file above using Python and load them into a CSV file. The expected result is:

CODE|ERR CODE|NUM|EXCH_CODE|
912828U rp|0|1|BERLIN|
1392917 rp|0|1|IND|
3CB0248 rp|0|1|BRAZIL|


1 Answer


You can use a regex with a non-greedy quantifier to capture the text between the two marker strings.

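import re

# Read the whole file into one string ('file.txt' stands in for the path to your .OUT file)
with open('file.txt', 'r') as file:
    data = file.read()
# Non-greedy capture of everything between the two markers; DOTALL lets '.' match newlines
pattern = re.compile(r'START-OF-DATA(.*?)END-OF-DATA', re.IGNORECASE | re.DOTALL)
g = re.findall(pattern, data)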

O/P
[' \nCODE|ERR CODE|NUM|EXCH_CODE|\n912828U rp|0|1|BERLIN|\n1392917 rp|0|1|IND| \n3CB0248 rp|0|1|BRAZIL| \n']

# Remove all spaces, split on newlines, and drop empty entries.
# Note: replace(" ", "") also removes spaces inside values ("ERR CODE" becomes "ERRCODE").
t = g[0].replace(" ", "").split("\n")
new = list(filter(None, t))

O/P
['CODE|ERRCODE|NUM|EXCH_CODE|', '912828Urp|0|1|BERLIN|', '1392917rp|0|1|IND|', '3CB0248rp|0|1|BRAZIL|']
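
Note that replace(" ", "") removes every space, including the ones inside values, so "ERR CODE" becomes "ERRCODE" and "912828U rp" becomes "912828Urp". If those inner spaces matter, a minimal sketch like the one below strips only the leading and trailing whitespace of each line. The helper name clean_lines and the sample string are hypothetical and not part of the original answer.

```python
def clean_lines(block):
    """Return the non-empty lines of block with surrounding whitespace removed."""
    # splitlines() handles both \n and \r\n line endings
    return [line.strip() for line in block.splitlines() if line.strip()]


# Sample text shaped like the captured group from the regex step
sample = " \nCODE|ERR CODE|NUM|EXCH_CODE|\n912828U rp|0|1|BERLIN|\n1392917 rp|0|1|IND| \n"
print(clean_lines(sample))
```

The result can be used in place of the list called new in the steps that follow.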

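import pandas as pd
# Create a DataFrame by splitting each line on the pipe delimiter;
# rstrip('|') drops the trailing pipe so no empty fifth column appears.
df = pd.DataFrame([i.rstrip('|').split('|') for i in new])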

O/P

           0        1    2          3 
0       CODE  ERRCODE  NUM  EXCH_CODE  
1  912828Urp        0    1     BERLIN  
2  1392917rp        0    1        IND  
3  3CB0248rp        0    1     BRAZIL 

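# Write the CSV. The first DataFrame row already holds the column names,
# so skip pandas' own numeric header and index.
df.to_csv('file.csv', index=False, header=False)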
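
If pandas is not needed for anything else, the same file could probably be produced with the standard csv module instead. This is only a sketch under a few assumptions: the function name rows_to_csv and the two sample lines are made up for illustration, and each line is assumed to end with a trailing pipe as in the data above.

```python
import csv
import io


def rows_to_csv(lines, out):
    """Write pipe-delimited lines to the file-like object out as CSV rows."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        # Drop the trailing pipe so it does not produce an empty last field
        writer.writerow(line.rstrip("|").split("|"))


lines = ["CODE|ERRCODE|NUM|EXCH_CODE|", "912828Urp|0|1|BERLIN|"]
buffer = io.StringIO()
rows_to_csv(lines, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open('file.csv', 'w', newline='') as the second argument.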

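The pattern matches any span that begins with "START-OF-DATA" and ends with "END-OF-DATA" and captures the text between the two markers. Because .*? is non-greedy, each match stops at the first END-OF-DATA that follows, and re.DOTALL lets . match newlines so the capture can span several lines. re.findall returns the captured text of every match as a list, which is why the code reads g[0].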
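
To see why the non-greedy quantifier matters, here is a small sketch with two data blocks in one string. The sample text is invented for illustration. With (.*?) each block is captured separately, while a greedy (.*) produces a single capture running from the first START-OF-DATA to the last END-OF-DATA.

```python
import re

# Two marker-delimited blocks separated by an unrelated line
text = (
    "START-OF-DATA\na|1|\nEND-OF-DATA\n"
    "noise\n"
    "START-OF-DATA\nb|2|\nEND-OF-DATA"
)

lazy = re.findall(r"START-OF-DATA(.*?)END-OF-DATA", text, re.DOTALL)
greedy = re.findall(r"START-OF-DATA(.*)END-OF-DATA", text, re.DOTALL)

print(len(lazy))    # one capture per block
print(len(greedy))  # a single capture spanning both blocks
```

If the .OUT file only ever contains one data block, both forms return the same text, but the non-greedy form is the safer default.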

