Bulk upload files to AWS S3 bucket

By Webner

How to bulk upload files to AWS S3 Bucket (using Laravel)

Amazon Simple Storage Service (Amazon S3) is used to store and retrieve any amount of data, at any time, from anywhere on the web. One way to store data in an AWS S3 bucket is to simply put the files into S3 one by one using the put method, like:

$s3 = Storage::disk('s3');
foreach ($files as $file) {
    // put() expects a destination path and the file contents
    // (assuming $files is a list of local file paths here)
    $s3->put(basename($file), file_get_contents($file));
}

Here,

Storage is a class in Laravel that is used to interact with any of the configured disks.

disk is a static method of the Storage class that connects to the particular disk mentioned in its parameter; here, s3 is the disk we are connecting to. If we do not use this method, Laravel uses the default disk and performs all operations on it.

put is also a method of the Storage class that writes a file to the given disk.
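
For instance, here is a minimal sketch of how these calls fit together (the uploads/example.txt path and its contents are just placeholders):

	use Illuminate\Support\Facades\Storage;

	// Write to the default disk configured in config/filesystems.php
	Storage::put('uploads/example.txt', 'file contents');

	// Write the same file explicitly to the s3 disk
	Storage::disk('s3')->put('uploads/example.txt', 'file contents');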

But when the data is too large and contains many files, storing the files in S3 one by one becomes a very time-consuming process. It takes a lot of time even for smaller files.

Therefore, to avoid this situation, we can bulk upload to S3 using the AWS CLI, a tool that provides a set of simple file commands for efficient file transfers to and from Amazon S3.

The AWS CLI performs recursive uploads of multiple files in a single folder-level command, transferring files in parallel for increased performance.

NOTE:
The AWS CLI does not come with Laravel by default; we need to install it separately. It can be installed on Linux, Windows, or macOS either from pip or using the bundled installer.
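
For example, installing it via pip (the awscli package on PyPI) and verifying the installation looks like this:

	$ pip install awscli
	$ aws --version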

Example: bulk uploading images to a folder in an S3 bucket in PHP (Laravel).

Step 1. In the .env file of Laravel, add the AWS details:

AWS_ACCESS_KEY_ID=ACCESSKEYID
AWS_SECRET_ACCESS_KEY=SecRETAcCeSskEY
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=aws_bucket_name

Here, AWS_ACCESS_KEY_ID will contain the access key ID of AWS S3, which can be found easily in the AWS account.
AWS_SECRET_ACCESS_KEY will contain the secret access key of AWS.
AWS_DEFAULT_REGION will contain the default region of your AWS account.
AWS_BUCKET will contain the name of the bucket where we want to store the files.

We provide these credentials in .env for security purposes, as this file cannot be accessed from outside.

Step 2. In filesystems.php under the config folder:
(i) Add the driver to be used, which reads the AWS S3 bucket credentials we already mentioned in the .env file, so that the credentials stay out of the codebase.

's3' => [
    'driver' => 's3',
    'key' => env('AWS_ACCESS_KEY_ID'),
    'secret' => env('AWS_SECRET_ACCESS_KEY'),
    'region' => env('AWS_DEFAULT_REGION'),
    'bucket' => env('AWS_BUCKET'),
    'url' => env('AWS_URL'),
],

Here, we have used 's3' as the name for the S3 driver. The credentials are provided through the .env file; for example, env('AWS_ACCESS_KEY_ID') means the access key we set in the .env file for AWS_ACCESS_KEY_ID will be used here.

(ii) Provide the path where image files are stored locally:

'storage_folder' => [
    'driver' => 'local',
    'root'   => storage_path() . '/File_storage',
],

Here,
the storage_path() function returns the path to Laravel's storage directory.
File_storage is the name of the folder that will be created inside the storage folder of Laravel to store the images locally.
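
As a quick illustration, the resulting root resolves to an absolute path on disk (the /var/www/myapp prefix below is just an example):

	$root = storage_path() . '/File_storage';
	// e.g. /var/www/myapp/storage/File_storage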

Step 3. This is the code that stores all the files in S3 at once using an AWS CLI command.
In the code below, we initially store the files in a folder in our local storage (the storage folder of Laravel). Once the files are stored locally, we use an AWS CLI command to move the folder (where the files are stored) to S3. In this way, a single AWS CLI command moves the whole folder structure to the S3 bucket.

use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Storage;
use Symfony\Component\Process\Process;

$folder_disk = Storage::disk('storage_folder');

// Download the image from its URL using cURL
$curl_handle = curl_init();
curl_setopt($curl_handle, CURLOPT_URL, $imagefile['url']);
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
$image_file = curl_exec($curl_handle);
curl_close($curl_handle);

// Store the downloaded contents in the local File_storage folder
$path = $folder_name . '/' . $name . '.' . $extension;
$folder_disk->put($path, $image_file);

// Move the whole local folder to S3 in a single AWS CLI command
// (string commands work in older Symfony Process versions; newer
// versions expect an array or Process::fromShellCommandline())
$process = new Process("aws s3 mv '".storage_path()."/File_storage' s3://folder-storage-example --recursive --acl public-read");
$process->setTimeout(0);
$process->run();
if ($process->isSuccessful()) {
    Log::info("Files moved to S3.");
} else {
    Log::info("Files not moved to S3.----->".$process->getErrorOutput());
}

Here,

$folder_disk = Storage::disk('storage_folder');

This line gets the disk (where we are going to store files locally before moving them to S3) which we have already defined in filesystems.php.

$curl_handle = curl_init();
curl_setopt($curl_handle, CURLOPT_URL, $imagefile['url']);
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
$image_file = curl_exec($curl_handle);
curl_close($curl_handle);


In the above lines of code, we are using cURL to fetch the image files from their URLs.

$imagefile['url'] gives the URL of the image file that needs to be stored.

$path = $folder_name . '/' . $name . '.' . $extension;
$folder_disk->put($path, $image_file);

In the above lines,

$path provides the structure under which we store the image file in local storage.

Then we store the image file in local storage using the put method on the storage disk we defined in filesystems.php. For example, if the image file name is 'example' with extension '.png' and folder_name is 'Images', then our storage folder will contain another folder 'Images' inside which 'example.png' will be stored, as sketched below.
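
As a concrete sketch with those example values:

	$folder_name = 'Images';
	$name = 'example';
	$extension = 'png';
	$path = $folder_name . '/' . $name . '.' . $extension;
	// $path is now 'Images/example.png', so the file lands at
	// storage/File_storage/Images/example.png locally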

$process = new Process("aws s3 mv '".storage_path()."/File_storage' s3://folder-storage-example --recursive --acl public-read");
$process->setTimeout(0);
$process->run();
if ($process->isSuccessful()) {
    Log::info("Files moved to S3.");
} else {
    Log::info("Files not moved to S3.----->".$process->getErrorOutput());
}

In the above lines, we are using the Symfony Process component, which executes commands in sub-processes. We simply create an object of the Process class and pass it the command to run.
Note:
The Process class is not present in Laravel by default. We need to install it, either by using the command:

	$ composer require symfony/process

or by cloning the https://github.com/symfony/process repository.

This is the command we have used; it moves the File_storage folder to the S3 bucket at the s3://folder-storage-example URL with a recursive upload:

aws s3 mv '".storage_path()."/File_storage' s3://folder-storage-example --recursive --acl public-read

Here,

aws is used at the start of each AWS CLI command to invoke the AWS CLI.
s3 selects the set of S3 commands within the CLI.
mv is the command to move files and folders.
'".storage_path()."/File_storage' is the folder we have to move.
s3://folder-storage-example is the destination URL where we want to move the folder.
--recursive is used for recursive upload so that the folder's contents are uploaded fully.
--acl sets the access control list (ACL), i.e. the permissions on the uploaded objects.
public-read makes the uploaded objects publicly readable. We can change this value according to the access we want to provide.
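
As a side note, if we wanted to keep the local copies instead of moving them, the AWS CLI also provides a sync command, which is recursive by default (using the same example bucket as above):

	aws s3 sync '".storage_path()."/File_storage' s3://folder-storage-example --acl public-read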

Then we simply run the process with the run() method of the Process class, after setting its timeout to 0, which means no maximum time limit.

Finally, we check whether the process ran successfully and log information accordingly.
