How to get the “To” email address from eml files?

By Webner

What is an EML file?

EML is a file extension for email messages: a single email is stored in a file with the .eml extension. The file follows the standard MIME RFC 822 format and is used by Microsoft Outlook Express and other email programs. EML files can contain plain ASCII text as well as links and attachments.

EML files can be exported to archive old emails or to scan them for malware. The Nimda virus is known for creating EML files. When EML files arrive as embedded attachments, it is best practice to scan them with anti-virus software before opening them.

MIME type: message/rfc822
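To make the format concrete, here is a minimal sketch of what an RFC 822 style message looks like and how its header block can be separated from its body at the first blank line. The sample message and the `splitEml()` helper are made up for illustration; they are not part of any library.

```php
<?php
// A tiny, made-up message: header lines, one blank line, then the body.
$sampleEml = "From: alice@example.com\r\n"
    . "To: bob@example.com\r\n"
    . "Subject: Hello\r\n"
    . "\r\n"
    . "This is the body.\r\n";

// Hypothetical helper: split raw message text into its header block and body.
function splitEml($raw) {
    // Normalise line endings so the blank-line search works for CRLF and LF files.
    $raw = str_replace("\r\n", "\n", $raw);
    // The first blank line separates the headers from the body.
    $parts = explode("\n\n", $raw, 2);
    return array(
        'headers' => $parts[0],
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

$message = splitEml($sampleEml);
echo $message['headers'], "\n---\n", $message['body'];
```

Everything before the blank line is header data (From, To, Subject and so on); everything after it is the message body, which in real files may itself be split into MIME parts.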

How to read an EML file in PHP?

There are a couple of ways to do it.

One approach is to use the “php-mime-mail-parser” library, which is not built into PHP; it relies on the separately installed mailparse extension. The main problem I ran into is that it did not read multipart content (attachments) from the EML files. I tried it, but it didn’t work properly for me: it read only the text/HTML content of the EML files, not the attachments.

Another way is to simply do it yourself; it’s not that complicated.

For example, the following sample code finds every .eml file in the current folder and its sub-folders and reads each file’s contents:

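// Recursively collect files matching $pattern, starting from $path.
function rglob($pattern = '*', $flags = 0, $path = '') {
    // Sub-directories of $path; GLOB_MARK appends a trailing slash to each.
    // glob() can return false on error, so fall back to an empty array.
    $paths = glob($path . '*', GLOB_MARK | GLOB_ONLYDIR | GLOB_NOSORT) ?: array();
    // Files in $path that match the pattern.
    $files = glob($path . $pattern, $flags) ?: array();
    foreach ($paths as $path) {
        $files = array_merge($files, rglob($pattern, $flags, $path));
    }
    return $files;
}

// Read the raw contents of every .eml file found.
foreach (rglob("*.eml") as $eml) {
    $emlContent = file_get_contents($eml);
}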
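As an alternative to the glob-based rglob() above, the same recursive search can be sketched with PHP’s SPL directory iterators. The function name `findEmlFiles()` is hypothetical; the case-insensitive extension check is an added assumption so that files named .EML are also picked up on case-sensitive filesystems.

```php
<?php
// Hypothetical helper: list every .eml file under $dir using SPL iterators.
function findEmlFiles($dir) {
    $files = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // $file is an SplFileInfo; compare the extension case-insensitively.
        if ($file->isFile() && strtolower($file->getExtension()) === 'eml') {
            $files[] = $file->getPathname();
        }
    }
    // Sort so the result order does not depend on the filesystem.
    sort($files);
    return $files;
}
```

Calling `findEmlFiles('.')` returns the paths of all EML files below the current folder, which can then be passed to file_get_contents() exactly as in the loop above.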

Here is a script I wrote to read data from EML files. For simplicity, it extracts only the To: email addresses.

Before running the script, place all the EML files in the same folder as the script (sub-folders are searched too). Then run the script: it should take no more than 1-2 minutes to process 5000+ EML files, and it creates a CSV file, Email.csv, containing the unique email addresses from all the EML files.

<?php
$emails = array();
foreach (rglob("*.eml") as $eml) {
    $emlContent = file_get_contents($eml);
    $data = explode("\n", $emlContent);

    // Collect every line that starts with "To: ".
    $toemailaddress = array();
    foreach ($data as $values) {
        if (substr($values, 0, 4) === 'To: ') {
            $toemailaddress[] = $values;
        }
    }

    // Pull the email addresses out of each "To: " line.
    foreach ($toemailaddress as $emailcontent) {
        preg_match_all('/([A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6})/i', $emailcontent, $matches, PREG_PATTERN_ORDER);
        foreach ($matches[1] as $match) {
            $emails[] = $match;
        }
    }
}

// Write the unique addresses to Email.csv, one per line.
$emails = array_unique($emails);
$fp = fopen('Email.csv', 'w+');
foreach ($emails as $values) {
    fwrite($fp, $values . "\r\n");
}
fclose($fp);

// Recursively collect files matching $pattern, starting from $path.
function rglob($pattern = '*', $flags = 0, $path = '') {
    // glob() can return false on error, so fall back to an empty array.
    $paths = glob($path . '*', GLOB_MARK | GLOB_ONLYDIR | GLOB_NOSORT) ?: array();
    $files = glob($path . $pattern, $flags) ?: array();
    foreach ($paths as $path) {
        $files = array_merge($files, rglob($pattern, $flags, $path));
    }
    return $files;
}
?>
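The script above checks every line of the file, so it can miss recipients when a long To: header is folded across several lines, and it can pick up body lines that happen to start with "To: ". The following is a rough sketch of a stricter variant, assuming the usual RFC 822 conventions (headers end at the first blank line, continuation lines start with a space or tab, header names are case-insensitive). The function name `extractToAddresses()` and the sample message are hypothetical.

```php
<?php
// Hypothetical helper: return the unique addresses found in the To: header of a raw message.
function extractToAddresses($raw) {
    // Normalise line endings, then keep only the header block
    // (everything before the first blank line).
    $raw = str_replace("\r\n", "\n", $raw);
    $parts = explode("\n\n", $raw, 2);
    $headers = $parts[0];

    // Unfold continuation lines: a line break followed by spaces or tabs
    // continues the previous header.
    $headers = preg_replace('/\n[ \t]+/', ' ', $headers);

    $addresses = array();
    foreach (explode("\n", $headers) as $line) {
        // Header names are case-insensitive, so accept "To:", "TO:" and "to:".
        if (stripos($line, 'To:') === 0) {
            preg_match_all('/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i', $line, $matches);
            foreach ($matches[0] as $address) {
                $addresses[] = $address;
            }
        }
    }
    // Re-index so the result is a plain list of unique addresses.
    return array_values(array_unique($addresses));
}

// Made-up message with a folded To: header and a misleading body line.
$sample = "From: alice@example.com\n"
    . "TO: Bob <bob@example.com>,\n"
    . "\tCarol <carol@example.org>\n"
    . "Subject: Hi\n"
    . "\n"
    . "To: not-a-header@example.net\n";

print_r(extractToAddresses($sample));
```

For the sample message this yields bob@example.com and carol@example.org, and skips the address in the body. The function could replace the two inner loops of the script, with its result merged into $emails for each file.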
