In this post, we will see how to install Python packages on AWS EMR Notebooks. AWS EMR Notebooks are based on Jupyter Notebook. Note the following points about additional ad-hoc packages installed this way -
%%info
You can modify the config as per your preference -
%%configure -f
{
  "conf":{
    "spark.executor.memory":"4G",
    "spark.dynamicAllocation.enabled":"false"
  }
}
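If you would rather generate the `%%configure` body than type it by hand, the minimal sketch below builds the same JSON from a Python dict. The helper name `build_configure_body` is hypothetical and only for illustration; the keys are ordinary Spark properties nested under `conf`, and the values shown are just examples.

```python
import json


def build_configure_body(spark_conf):
    """Return JSON text that can be pasted under `%%configure -f`.

    `spark_conf` is a dict of Spark property names to string values.
    This helper is an illustrative assumption, not part of EMR or sparkmagic.
    """
    # Spark properties are nested under the "conf" key of the session config.
    return json.dumps({"conf": spark_conf}, indent=2)


body = build_configure_body({
    "spark.executor.memory": "4G",
    "spark.dynamicAllocation.enabled": "false",
})
print(body)
```

Building the body this way mainly guards against JSON syntax slips such as a missing comma or quote.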
sc.install_pypi_package("boto3")
sc.install_pypi_package("numpy==x.y.z") # Install numpy version x.y.z
sc.install_pypi_package("numpy") # Install numpy latest version
sc.install_pypi_package("<package_name_with_version>", "https://pypi.org/simple") # Install from specific PyPI repo
sc.install_pypi_package("pandas", "https://pypi.org/simple")
sc.list_packages()
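When several packages are needed, the install calls can be wrapped in a small loop. The sketch below is one way to do that, assuming `sc` accepts a package spec and an optional repository URL as in the calls above. Both `install_packages` and `DryRunContext` are hypothetical names; the stand-in class exists only so the helper can be tried outside an EMR notebook. In a real notebook you would pass the actual `sc` instead.

```python
class DryRunContext:
    """Hypothetical stand-in for `sc`; it records calls instead of installing."""

    def __init__(self):
        self.calls = []

    def install_pypi_package(self, *args):
        # Store the arguments so the dry run can be inspected afterwards.
        self.calls.append(args)


def install_packages(sc, specs, repo_url=None):
    """Install each pip-style spec (e.g. "numpy" or "numpy==x.y.z") via `sc`."""
    for spec in specs:
        if repo_url is None:
            sc.install_pypi_package(spec)
        else:
            sc.install_pypi_package(spec, repo_url)
    return list(specs)


# Dry run outside EMR; in a notebook, call install_packages(sc, [...]) instead.
demo = DryRunContext()
install_packages(demo, ["boto3", "pandas"], repo_url="https://pypi.org/simple")
print(demo.calls)
```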
%%local
conda list
sc.uninstall_package('<package_name>')
Hope this helps.