Posts

Showing posts with the label Amazon S3

Backup AWS DynamoDB To S3

Answer : With the introduction of AWS Data Pipeline, which ships a ready-made template for DynamoDB-to-S3 backups, the easiest way is to schedule a backup in Data Pipeline [link]. If you have special needs (data transformation, very fine-grained control ...), consider the answer by @greg. There are some good guides for working with MapReduce and DynamoDB. I followed this one the other day and got data exporting to S3 going reasonably painlessly. I think your best bet would be to create a Hive script that performs the backup task, save it in an S3 bucket, then use the AWS API for your language to programmatically spin up a new EMR job flow and complete the backup. You could set this up as a cron job. Example of a Hive script exporting data from DynamoDB to S3: CREATE EXTERNAL TABLE my_table_dynamodb ( company_id string ,id string ,name string ,city string ,state string ,postal_code string) STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' TB...
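
If you go the scripted EMR route described above, a minimal sketch of the cron-able CLI call might look like this (the cluster sizing, release label, bucket and script path are illustrative placeholders, not values from the answer):

# Spin up a transient EMR cluster that runs the Hive backup script stored in S3
# and terminates itself when the step finishes (all names are placeholders).
aws emr create-cluster \
  --name "dynamodb-s3-backup" \
  --release-label emr-5.30.0 \
  --applications Name=Hive \
  --use-default-roles \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --steps Type=HIVE,Name="ExportToS3",ActionOnFailure=TERMINATE_CLUSTER,Args=[-f,s3://my-backup-bucket/scripts/backup.q] \
  --auto-terminate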

Amazon AWS Filezilla Transfer Permission Denied

Answer : To give the ec2-user account (Amazon AWS) write access to the public web directory (/var/www/html), enter this command via PuTTY or Terminal, as root (sudo): chown -R ec2-user /var/www/html Then make sure permissions on that entire folder are correct: chmod -R 755 /var/www/html Docs: Setting up Amazon EC2 instances, Connect to Amazon EC2 file directory using Filezilla and SFTP (video), Understanding and Using File Permissions. If you are using CentOS, use: sudo chown -R centos:centos /var/www/html sudo chmod -R 755 /var/www/html For Ubuntu: sudo chown -R ubuntu:ubuntu /var/www/html sudo chmod -R 755 /var/www/html For the Amazon AMI: sudo chown -R ec2-user:ec2-user /var/www/html sudo chmod -R 755 /var/www/html In my case /var/www/html is not a directory but a symbolic link to /var/app/current, so you should change the real directory, i.e. /var/app/current: sudo chown -R ec2-user /var/app/current sudo chmod -R 755 /var/app/current I hope th...
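
Before changing anything, it can help to confirm what you are actually dealing with; a small check sketch (paths assumed to match the answer above):

# Show current owner, group and mode of the web root
ls -ld /var/www/html

# If it is a symlink (as in the /var/app/current case), resolve the real target first
readlink -f /var/www/html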

AWS EFS Vs EBS Vs S3 (Differences & When To Use?)

Answer : One-word answer: MONEY :D Cost to store 1 GB in US-East-1 (updated 2016-12-20): Glacier: $0.004/Month (note: major price cut in 2016) S3: $0.023/Month S3-IA (announced in 2015-09): $0.0125/Month (+ $0.01/GB retrieval charge) EBS: $0.045-0.1/Month (depends on speed - SSD or not) + IOPS costs EFS: $0.3/Month Further storage options, which may be used for temporarily storing data while/before processing it: SNS, SQS, Kinesis streams, DynamoDB, SimpleDB. The costs above are just samples. They can differ by region and can change at any point, and there are extra costs for data transfer (out to the internet). However, they show the ratio between the prices of the services. There are a lot more differences between these services: EFS is: Generally Available (out of preview), but may not yet be available in your region Network filesystem (that means it may have bigger latency but it c...

AWS CloudFront Access Denied To S3 Bucket

Answer : To assist with your question, I recreated the situation via: Created an Amazon S3 bucket with no Bucket Policy; uploaded public.jpg and made it public via "Make Public"; uploaded private.jpg and kept it private; created an Amazon CloudFront web distribution with: Origin Domain Name: selected my S3 bucket from the list; Restrict Bucket Access: Yes; Origin Access Identity: Create a New Identity; Grant Read Permissions on Bucket: Yes, Update Bucket Policy. I checked the bucket, and CloudFront had added a Bucket Policy similar to yours. The distribution was marked as In Progress for a while. Once it said Enabled, I accessed the files via the xxx.cloudfront.net URL: xxx.cloudfront.net/public.jpg redirected me to the S3 URL http://bucketname.s3.amazonaws.com/public.jpg. Yes, I could see the file, but it should not use a redirect. xxx.cloudfront.net/private.jpg redirected me also, but I then received Access Denied because it is a private file in...
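
If you prefer to verify the same setup from the command line rather than the console, a rough sketch might be (bucket name and distribution ID are placeholders):

# Show the bucket policy CloudFront added for the origin access identity
aws s3api get-bucket-policy --bucket bucketname --query Policy --output text

# Confirm the S3 origin of the distribution actually references that identity
aws cloudfront get-distribution-config --id EXAMPLEDISTID \
  --query 'DistributionConfig.Origins.Items[].S3OriginConfig'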

CloudFront S3 Access Denied

Answer : Resolved by changing my URL FROM http://ID.cloudfront.net/bucket/uploads/academy/logo/1/logo.jpg TO http://ID.cloudfront.net/uploads/academy/logo/1/logo.jpg
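
The fix works because the distribution's origin already points at the bucket, so repeating the bucket name in the path asks S3 for a key that does not exist. A quick way to confirm, using the same placeholder ID and paths from the answer:

# Expected to succeed once the bucket name is dropped from the path
curl -I http://ID.cloudfront.net/uploads/academy/logo/1/logo.jpg

# Expected to fail (Access Denied / NoSuchKey) with the bucket name repeated
curl -I http://ID.cloudfront.net/bucket/uploads/academy/logo/1/logo.jpg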

Auto Create S3 Buckets On Localstack

Answer : This is a change that came in with this commit, available since version 0.10.0. When a container is started for the first time, it will execute files with the .sh extension that are found in /docker-entrypoint-initaws.d. Files are executed in alphabetical order. You can easily create AWS resources on LocalStack using the awslocal (or aws) CLI tool in the initialization scripts.

version: '3.7'
services:
  localstack:
    image: localstack/localstack
    environment:
      - SERVICES=s3
    ports:
      - "4572:4572"
    volumes:
      - ./aws:/docker-entrypoint-initaws.d

With a script at ./aws/buckets.sh:

#!/bin/bash
set -x
awslocal s3 mb s3://bucket
set +x

Note: the set -x / set +x is purely there to turn on and off echoing of the commands being executed. This will produce output like: ... localstack_1 | Starting mock S3 (http port 4572)... localstack_1 | Waiting for all LocalStack services to be ready localstack_1 | Ready. localstack_1 | /...
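
To check that the init script actually ran, you can list buckets through the mapped edge port from the host; a small sketch (the port comes from the compose file above, and LocalStack accepts dummy credentials):

# List buckets on the LocalStack S3 endpoint; "bucket" created by buckets.sh should appear
AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test \
  aws --endpoint-url=http://localhost:4572 s3 ls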

AWS - How To Install Java 11 On An EC2 Linux Machine?

Answer : Another option is to run the following commands. To install Java 11: sudo amazon-linux-extras install java-openjdk11 For Java 8 you can try: sudo yum install java-1.8.0-openjdk Finally, if you want to switch between Java versions, run: sudo alternatives --config java Alternatively, use one of the OpenJDK distributions: https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html or https://adoptopenjdk.net/?variant=openjdk11&jvmVariant=hotspot
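
After installing (or switching with alternatives), a quick sanity check:

# Show which Java version is currently active
java -version

# List all registered Java alternatives without changing the selection
sudo alternatives --display java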

AmazonS3Client(credentials) Is Deprecated

Answer : You can use either AmazonS3ClientBuilder or AwsClientBuilder as an alternative. For S3, the simplest is AmazonS3ClientBuilder: BasicAWSCredentials creds = new BasicAWSCredentials("access_key", "secret_key"); AmazonS3 s3Client = AmazonS3ClientBuilder.standard().withCredentials(new AWSStaticCredentialsProvider(creds)).build(); Use the code listed below to create an S3 client without credentials: AmazonS3 s3Client = AmazonS3ClientBuilder.standard().build(); A usage example would be a Lambda function calling S3. You need to pass the region information through the com.amazonaws.regions.Region object: use AmazonS3Client(credentials, Region.getRegion(Regions.REPLACE_WITH_YOUR_REGION))

AWS Athena MSCK REPAIR TABLE Takes Too Long For A Small Dataset

Answer : MSCK REPAIR TABLE can be a costly operation, because it needs to scan the table's sub-tree in the file system (the S3 bucket). Multiple levels of partitioning can make it more costly, as it needs to traverse additional sub-directories. Assuming all potential combinations of partition values occur in the data set, this can turn into a combinatorial explosion. If you are adding new partitions to an existing table, then you may find that it's more efficient to run ALTER TABLE ADD PARTITION commands for the individual new partitions. This avoids the need to scan the table's entire sub-tree in the file system. It is less convenient than simply running MSCK REPAIR TABLE , but sometimes the optimization is worth it. A viable strategy is often to use MSCK REPAIR TABLE for an initial import, and then use ALTER TABLE ADD PARTITION for ongoing maintenance as new data gets added into the table. If it's really not feasible to use ALTER TABLE ADD PARTITION t...
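
For the ongoing-maintenance case, a sketch of adding one new partition through the Athena API from a shell (the table name, partition keys, locations and results bucket are hypothetical):

# Register a single newly arrived partition instead of rescanning the whole table
aws athena start-query-execution \
  --query-string "ALTER TABLE my_table ADD IF NOT EXISTS PARTITION (year='2020', month='01') LOCATION 's3://my-data-bucket/my_table/year=2020/month=01/'" \
  --result-configuration OutputLocation=s3://my-athena-results-bucket/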

Check If File Exists In S3 Using Ls And Wildcard

Answer : (Re-drafted from a comment, as it appears this answered the question.) I myself tried, and failed, to use wildcards in the aws-cli, and according to the docs this is not currently supported. The simplest (though least efficient) solution would be to use grep: aws s3 ls s3://my-bucket/folder/ | grep myfile Alternatively, you could write a short Python (or other) script to do this more efficiently (but not in a single command). aws s3 ls does not support globs, but sync does, and it has a dry-run mode. So if you do this (from an empty directory) you should get the results you want: aws s3 sync s3://my-bucket . --exclude "*" --include "folder/*myfile*" --dryrun It will produce lines like this for matching files: (dryrun) download s3://my-bucket/folder/myfile.txt to folder/myfile.txt (dryrun) download s3://my-bucket/folder/_myfile-foo.xml to folder/_myfile-foo.xml S3 doesn't support wildcard listing. You need to list all the files and grep them. aws s3 ls s3:...
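
If all you need is a yes/no answer in a script, the grep approach above can be wrapped directly in an exit-code check; a minimal sketch with placeholder names:

# grep -q exits 0 only if at least one listed key matches the pattern
if aws s3 ls s3://my-bucket/folder/ | grep -q myfile; then
  echo "at least one matching key exists"
else
  echo "no matching key found"
fi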

Amazon S3 Console: Download Multiple Files At Once

Answer : It is not possible through the AWS Console web user interface, but it's a very simple task if you install the AWS CLI. You can check the installation and configuration steps in Installing the AWS Command Line Interface. After that you go to the command line: aws s3 cp --recursive s3://<bucket>/<folder> <local_folder> This will copy all the files from the given S3 path to your given local path. If you use the AWS CLI, you can use the --exclude along with --include and --recursive flags to accomplish this: aws s3 cp s3://path/to/bucket/ . --recursive --exclude "*" --include "things_you_want" E.g. --exclude "*" --include "*.txt" will download all files with the .txt extension. More details: https://docs.aws.amazon.com/cli/latest/reference/s3/ Selecting a bunch of files and clicking Actions->Open opened each in a browser tab, and they immediately started to download (6 at a time).
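
If you expect to repeat the download, aws s3 sync may be a better fit than cp, since it only transfers files that are new or changed; a sketch using the same placeholder names:

# Only .txt files under the prefix are downloaded; unchanged files are skipped on re-runs
aws s3 sync s3://<bucket>/<folder> <local_folder> --exclude "*" --include "*.txt"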

Aws-sdk S3: Best Way To List All Keys With ListObjectsV2

Answer : This is the best way to do it, in my opinion:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const listAllKeys = (params, out = []) => new Promise((resolve, reject) => {
  s3.listObjectsV2(params).promise()
    .then(({Contents, IsTruncated, NextContinuationToken}) => {
      out.push(...Contents);
      !IsTruncated ? resolve(out) : resolve(listAllKeys(Object.assign(params, {ContinuationToken: NextContinuationToken}), out));
    })
    .catch(reject);
});
listAllKeys({Bucket: 'bucket-name'})
  .then(console.log)
  .catch(console.log);

Here is the code to get the list of keys from a bucket: var params = { Bucket: 'bucket-name' }; var allKeys = []; listAllKeys(); function listAllKeys() { s3.listObjectsV2(params, function (err, data) { if (err) { console.log(err, err.stack); // an error occurred } else { var contents = data.Contents; contents.forEach(function (c...
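
As a cross-check of the SDK code above, the AWS CLI pages through continuation tokens automatically, so the same listing can be produced in one command (bucket name is a placeholder):

# Prints every key in the bucket; the CLI follows continuation tokens for you
aws s3api list-objects-v2 --bucket bucket-name --query 'Contents[].Key' --output text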

AWS: Custom SSL Certificate Option Is Disabled In CloudFront, But I Created An SSL Certificate Using AWS Certificate Manager

Answer : Certificates that will be used with an Application Load Balancer (ELB/2.0) need to be created in ACM in the same region as the balancer. Certificates that will be used with CloudFront always need to be created in us-east-1. To use an ACM Certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) region. ACM Certificates in this region that are associated with a CloudFront distribution are distributed to all the geographic locations configured for that distribution. – http://docs.aws.amazon.com/acm/latest/userguide/acm-regions.html The reason for this is that CloudFront doesn't follow the regional boundary model in AWS. CloudFront edge locations are all over the globe, but are configured and managed out of us-east-1 -- think of it as CloudFront's home region. Once a distribution reaches the Deployed state, it is not operationally dependent on us-east-1, but during provisioning, everything originates from that ...
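
In practice this means requesting (or importing) the CloudFront certificate explicitly in us-east-1, whatever your default region is; a sketch with a placeholder domain:

# The certificate must live in us-east-1 to be selectable in CloudFront
aws acm request-certificate \
  --domain-name example.com \
  --validation-method DNS \
  --region us-east-1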

AccessDenied For ListObjectsV2 Operation For S3 Bucket

Answer : I'm not sure the accepted answer is actually acceptable, as it simply allows all operations on the bucket. Also, the Sid is misleading... ;-) This AWS article mentions the required permissions for aws s3 sync. This is what a corresponding policy looks like:

{
  "Version": "version_id",
  "Statement": [
    {
      "Sid": "AllowBucketSync",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET-NAME",
        "arn:aws:s3:::BUCKET-NAME/*"
      ]
    }
  ]
}

Try updating your bucket policy to: { "Version": "version_id", "Statement": [ { "Sid": "AllowPublicRead", "Effect": "Allow", ...
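
If that statement is meant to be attached to the IAM user or role doing the sync (rather than used as a bucket policy), one way to wire it up from the shell might be (user name, policy name and file path are hypothetical):

# Attach the sync policy shown above as an inline user policy, then retry the sync
aws iam put-user-policy \
  --user-name sync-user \
  --policy-name allow-bucket-sync \
  --policy-document file://allow-bucket-sync.json

aws s3 sync ./local-dir s3://BUCKET-NAME/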

Check If File Exists In S3 Bucket

Answer : You can do aws s3 ls on the actual filename. If the filename exists, the exit code will be 0 and the filename will be displayed; otherwise, the exit code will be non-zero: aws s3 ls s3://bucket/filename if [[ $? -ne 0 ]]; then echo "File does not exist"; fi The first answer is close, but in cases where you use -e in the shebang, the script will fail, which you most likely do not want. It is better to count the output instead. So you can use the command below: wordcount=`aws s3 ls s3://${S3_BUCKET_NAME}/${folder}/ | grep ${file} | wc -c` echo wordcount=${wordcount} if [[ "${wordcount}" -eq 0 ]]; then do something; else do something; fi Try the following: aws s3api head-object --bucket ${S3_BUCKET} --key ${S3_KEY} It retrieves the metadata of the object without retrieving the object itself. READ (s3:GetObject) access is required.
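
The head-object variant also plays nicely with set -e when wrapped in a conditional; a small sketch using the same placeholder variables:

# head-object exits non-zero if the key is missing, without downloading the object
if aws s3api head-object --bucket "${S3_BUCKET}" --key "${S3_KEY}" > /dev/null 2>&1; then
  echo "File exists"
else
  echo "File does not exist"
fi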