[wptelegram-join-channel]
Python program extract decimal number in the range of 00-99 using regular expressions.
Problem Definition
Develop a Python program to read through the lines of the file, find the lines which start with X followed by one or more non-space characters, followed by a: character and decimal number in their range of 00-99. Extract decimal numbers and display. Assume any input file.
Video Tutorial
Step by Step solution to in Problem
First, accept the filename from the user using the input statement and store it in variable fname.
Open the file with an “open” function in reading mode. If the file is not opened then the display file is not found (this can be handled by try and except blocks), otherwise continue with execution.
Use for loop to read the contents of the file line by line. In every iteration, one line will be read.
Use the stript() function to remove the leading and trailing extract characters (extra space and newline characters).
The regular expression ^X\S+:[0-9][0-9].[0-9]+ matches with all lines which start with X (^X part of the regular expression). One or more non-space characters (\S+ part of the regular expression), next to the : character is matched. [0-9][0-9].[0-9]+ is used to match the two-digit number with any number of digits after the decimal point.
First, we use the search function of a regular expression to display the lines. Which checks for the lines which match the regular expression, if there is a match search() function returns True. Then we print the line.
Consider input text file for demonstration say, test.txt with following contents
Name: Mahesh Huddar
From: Karanataka
From: xyz@abc.com
To: pqr@abc.com
X-DSPAM-Confidence:999.8475
X-DSPAM-Probability:0.0000
Mail From: XYZ123@ABC.COM
To: PQR123@ABC.COM
X-DSPAM-Confidence:91.9125
X-DSPAM-Probability:11.0080
Program Source code using regular expression and search() function
import re fname = input("Enter the file name: ") try: fhand = open(fname) for line in fhand: line = line.rstrip() if (re.search("^X\S+:[0-9][0-9]\.[0-9]+", line)): print (line) except: print ("File Not Found")
Output
Enter the file name: test.txt
X-DSPAM-Confidence:99.9125
X-DSPAM-Probability:11.0080
As per the problem statement, only the decimal part of the line should be returned and displayed on the standard output. Hence we need to use findall() function which checks for the match with a given regular expression. If a match is found the matched content is returned.
Program Source code using regular expression and findall() function
import re fname = input("Enter the file name: ") try: fhand = open(fname) for line in fhand: line = line.rstrip() lst = re.findall("^X\S+:([0-9][0-9]\.[0-9]+)", line) if (len(lst)>0): print (lst) except: print ("File Not Found")
Output
Enter the file name: test.txt
[‘99.9125’]
[‘11.0080’]
Summary:
This tutorial discusses how to write a Python program to extract decimal number in the range of 00-99 using regular expressions.