How to create an EMR cluster using Python boto3?

Can someone help me with the Python code to create an EMR cluster? Any help is appreciated.
Feb 26, 2019 in AWS by Reshma Bhattacharya

1 answer to this question.

The Python boto3 code for creating an EMR cluster is as follows:

import boto3

# Replace the placeholder keys with your own. If credentials are already
# configured (environment variables, ~/.aws/credentials, or an IAM role),
# the two key arguments can be omitted.
connection = boto3.client(
    'emr',
    region_name='us-east-1',
    aws_access_key_id='Access Key',
    aws_secret_access_key='Secret Key',
)

# run_job_flow returns a dict; the ID of the new cluster is under 'JobFlowId'.
response = connection.run_job_flow(
    Name='test_emr_job_boto3',
    LogUri='s3://priyaj',
    ReleaseLabel='emr-5.18.0',
    Applications=[
        {'Name': 'Spark'},
    ],
    Instances={
        'InstanceGroups': [
            {
                'Name': "Master",
                'Market': 'ON_DEMAND',
                'InstanceRole': 'MASTER',
                'InstanceType': 'm1.xlarge',
                'InstanceCount': 1,
            },
            {
                'Name': "Slave",
                'Market': 'ON_DEMAND',
                'InstanceRole': 'CORE',
                'InstanceType': 'm1.xlarge',
                'InstanceCount': 1,
            },
        ],
        'Ec2KeyName': 'Your key name',
        # Keep the cluster running after the last step has finished.
        'KeepJobFlowAliveWhenNoSteps': True,
        'TerminationProtected': False,
        'Ec2SubnetId': 'subnet-id',
    },
    Steps=[
        {
            'Name': 'file-copy-step',
            'ActionOnFailure': 'CONTINUE',
            'HadoopJarStep': {
                'Jar': 's3://Snapshot-jar-with-dependencies.jar',
                'Args': ['test.xml', 'emr-test', 'kula-emr-test-2'],
            },
        },
    ],
    VisibleToAllUsers=True,
    JobFlowRole='EMR_EC2_DefaultRole',
    ServiceRole='EMR_DefaultRole',
    Tags=[
        {'Key': 'tag_name_1', 'Value': 'tag_value_1'},
        {'Key': 'tag_name_2', 'Value': 'tag_value_2'},
    ],
)

cluster_id = response['JobFlowId']

This way you can create an EMR cluster with one master node and one core (slave) node.
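Note that run_job_flow returns as soon as the request is accepted, not when the cluster is up, and its return value is a dict whose 'JobFlowId' key holds the cluster ID. As a rough sketch of how you might wait for the cluster before using it, the helper below relies on the EMR client's 'cluster_running' waiter. The function name wait_until_running is only an illustration, and emr_client is assumed to be a boto3 EMR client like the one created in the answer.

```python
def wait_until_running(emr_client, response):
    """Take a run_job_flow response, block until the cluster is up,
    and return (cluster_id, current_state)."""
    cluster_id = response['JobFlowId']

    # The waiter polls describe_cluster until the cluster is usable,
    # and raises if the cluster terminates instead.
    emr_client.get_waiter('cluster_running').wait(ClusterId=cluster_id)

    state = emr_client.describe_cluster(ClusterId=cluster_id)['Cluster']['Status']['State']
    return cluster_id, state
```

You would call it with the client and the return value of run_job_flow, for example wait_until_running(connection, connection.run_job_flow(...)).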

answered Feb 27, 2019 by Priyaj
Hi,

Is there a way to set up a secondary master node and switch over to it if the primary one fails? If not, could you please suggest an alternative?
I will try it and let you know whether it is possible.
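One possible approach, as far as I know: EMR releases 5.23.0 and later can launch a cluster with three master nodes and fail over between them automatically, while the emr-5.18.0 release used in the answer supports only a single master. The sketch below builds an 'InstanceGroups' list for that setup. The helper name ha_instance_groups and its parameters are only illustrative, and the m1.xlarge default is simply carried over from the answer, so check that it is valid for the release you choose.

```python
def ha_instance_groups(instance_type='m1.xlarge', core_count=1):
    """Build an 'InstanceGroups' list with three master nodes.

    Assumption: the cluster uses EMR release 5.23.0 or later; earlier
    releases such as emr-5.18.0 accept only one master node.
    """
    return [
        {
            'Name': 'Master',
            'Market': 'ON_DEMAND',
            'InstanceRole': 'MASTER',
            'InstanceType': instance_type,
            'InstanceCount': 3,  # three masters instead of one
        },
        {
            'Name': 'Core',
            'Market': 'ON_DEMAND',
            'InstanceRole': 'CORE',
            'InstanceType': instance_type,
            'InstanceCount': core_count,
        },
    ]
```

The result would go into Instances['InstanceGroups'] in the run_job_flow call, together with a ReleaseLabel of 'emr-5.23.0' or later and an 'Ec2SubnetId', as in the answer.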
Is there any way to integrate this with a Lambda function, and how can we add an extra step?
You can create a Lambda function, import your code into it, and run the code from there.
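As a minimal sketch of that idea, the handler below adds one extra step to a cluster that is already running, using the EMR client's add_job_flow_steps call. The event keys ('cluster_id', 'step_name', 'jar', 'args') and the helper build_step are hypothetical names chosen for illustration; the optional emr_client argument is there only so the handler can be exercised without AWS access.

```python
def build_step(name, jar, args):
    """Return one step definition in the shape EMR expects in a Steps list."""
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {'Jar': jar, 'Args': list(args)},
    }


def lambda_handler(event, context, emr_client=None):
    """Add one step to an existing cluster and return the new step IDs.

    The event keys used here are hypothetical; adapt them to whatever
    your trigger actually sends.
    """
    if emr_client is None:
        import boto3  # bundled with the AWS Lambda Python runtime
        emr_client = boto3.client('emr')

    step = build_step(
        event.get('step_name', 'extra-step'),
        event['jar'],
        event.get('args', []),
    )
    result = emr_client.add_job_flow_steps(
        JobFlowId=event['cluster_id'],
        Steps=[step],
    )
    return {'step_ids': result['StepIds']}
```

The Lambda function's execution role would also need permission to call elasticmapreduce:AddJobFlowSteps on the cluster.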
