My inbox receives a lot of data because many websites send notifications and updates. So I set myself the task of writing a Python script to extract emails from a POP3 email server and organize the information in a better way.
I started with the first step: automatically reading emails from the mailbox. Based on examples I found on the Internet, I created a script that retrieves emails and removes unneeded information such as headers.
I am using web-based email (not Gmail). The script uses the poplib module, which encapsulates a connection to a POP3 server. The other module I am using is email, a library for managing email messages. Because I have many emails, I limited the for loop to 15 emails.
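To make the roles of the two modules more concrete, here is a minimal offline sketch that needs no server. It assumes the message arrives in the shape that poplib's retr() returns it, a list of byte strings with one entry per line. The sample message and the helper name parse_retrieved_lines are made up for illustration. Unlike a line-by-line approach, it joins the lines first and parses the message as a whole, so the headers stay available.

```python
import email


def parse_retrieved_lines(lines):
    """Join the byte lines of one message and parse them into a Message object."""
    raw = b"\r\n".join(lines)
    return email.message_from_bytes(raw)


# Hypothetical sample standing in for the second element of server.retr(n)
sample_lines = [
    b"From: news@example.com",
    b"Subject: Weekly update",
    b"",
    b"Hello, this is the body.",
]

msg = parse_retrieved_lines(sample_lines)
print(msg["Subject"])     # header lookup by name
print(msg.get_payload())  # body text of this simple, non-multipart message
```

For multipart messages get_payload() returns a list of sub-messages rather than a string, so real mail usually needs a bit more handling than this sample.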
There are still a few things that can be done. For example, I would like to keep the “FROM:” data, and some HTML tags still need to be removed. However, this code extracts the body text from emails and can be used as a starting point.
Feel free to provide any feedback or suggestions.
Here is the full source code for the Python script that gets the body text of emails from a mailbox.
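import poplib
import email

SERVER = "server_name"
USER = "email_address"
PASSWORD = "email_password"

# Connect to the POP3 server and log in
server = poplib.POP3(SERVER)
server.user(USER)
server.pass_(PASSWORD)

# Count the messages in the mailbox and process at most 15 of them
numMessages = len(server.list()[1])
if numMessages > 15:
    numMessages = 15

for i in range(numMessages):
    # retr() returns (response, message lines as bytes, size in octets);
    # POP3 message numbers start at 1
    (server_msg, body, octets) = server.retr(i + 1)
    for j in body:
        try:
            # Parse each line on its own: header lines yield an empty
            # payload, so only the body text gets printed
            msg = email.message_from_string(j.decode("utf-8"))
            strtext = msg.get_payload()
            print(strtext)
        except UnicodeDecodeError:
            # Skip lines that are not valid UTF-8
            pass

# Close the connection to the server
server.quit()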
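The two open items mentioned earlier, keeping the “FROM:” data and removing the leftover HTML tags, could be approached roughly as follows. This is only a sketch under some assumptions: the message is parsed as a whole rather than line by line, the names TagStripper, strip_tags and sender_and_text are my own for illustration, and the sample message is made up. It uses only the standard email and html.parser modules.

```python
import email
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text content of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_text):
    """Return html_text with all HTML tags removed."""
    stripper = TagStripper()
    stripper.feed(html_text)
    stripper.close()
    return "".join(stripper.parts)


def sender_and_text(msg):
    """Return (From header, body text) for a parsed email message."""
    texts = []
    for part in msg.walk():
        # Skip multipart containers and attachments such as images
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: fall back to UTF-8 when no charset is declared
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        if part.get_content_subtype() == "html":
            text = strip_tags(text)
        texts.append(text)
    return msg["From"], "\n".join(texts)


# Made-up sample message for illustration
raw = (
    b"From: news@example.com\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<p>Hello <b>world</b></p>"
)
sender, text = sender_and_text(email.message_from_bytes(raw))
print(sender)
print(text)
```

A message that declares a charset Python does not know would still raise an error in decode(), so a real script may want to catch LookupError there.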
References
1. Read Email, pop3
2. poplib — POP3 protocol client includes POP3 Example that opens a mailbox and retrieves and prints all messages
3. email — An email and MIME handling package
for i in range(15) :
should be
for i in range(numMessages) :
🙂
Hi Paul,
Yes, it should be for i in range(numMessages) :. I updated the code. Thanks for spotting this.
Best regards.