

Use the fully managed training process provided by SageMaker to automatically launch training instances and shut them down as soon as the training job is complete. This minimizes idle compute resources and thus limits the environmental impact of your training job.

Adopt a serverless architecture for your MLOps pipelines. For example, orchestration tools like AWS Step Functions or SageMaker Pipelines only provision resources when work needs to be done. This way, you’re not maintaining compute infrastructure 24/7.

Model training

Select sustainable AWS Regions

As mentioned in Part 1, select an AWS Region with sustainable energy sources. When regulations and legal aspects allow, choose Regions near Amazon renewable energy projects and Regions where the grid has low published carbon intensity to train your model.

Use a debugger

A debugger like SageMaker Debugger can identify training problems like system bottlenecks, overfitting, saturated activation functions, and under-utilization of system resources. It also provides built-in rules like LowGPUUtilization or Overfit. These rules monitor your workload and automatically stop a training job as soon as they detect a bug (Figure 2), which helps you avoid unnecessary carbon emissions.

Figure 2. Automatically stop buggy training jobs with SageMaker Debugger

Optimize the resources of your training environment

Reference the recommended instance types for the algorithm you’ve selected in the SageMaker documentation.
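To make the idea of a rule that stops a buggy training job more concrete, here is a minimal sketch in plain Python. It is not SageMaker Debugger's implementation or API: the function name `overfit_stop_step` and the parameters `ratio_threshold` and `patience` are assumptions chosen for illustration. It only shows the general shape of an overfitting check that watches training and validation loss and signals when to stop.

```python
def overfit_stop_step(train_losses, val_losses, ratio_threshold=0.1, patience=3):
    """Return the step at which training should stop, or None if it looks healthy.

    Illustrative only: flags a step as "bad" when validation loss exceeds
    training loss by more than `ratio_threshold` (relative gap), and asks to
    stop once `patience` consecutive bad steps have been seen.
    """
    bad_steps = 0
    for step, (train, val) in enumerate(zip(train_losses, val_losses)):
        gap = (val - train) / train if train > 0 else 0.0
        bad_steps = bad_steps + 1 if gap > ratio_threshold else 0
        if bad_steps >= patience:
            return step
    return None


# Hypothetical loss curves: validation loss stalls while training loss keeps falling.
train_curve = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
val_curve = [1.05, 0.85, 0.7, 0.65, 0.6, 0.6]
stop_at = overfit_stop_step(train_curve, val_curve)
```

Stopping at the returned step rather than running every planned epoch is what saves the compute, and therefore the emissions, that the remaining epochs would have consumed.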
Optimize your deep learning models to accelerate training

Compile your DL models from their high-level language representation to hardware-optimized instructions to reduce training time. You can achieve this with open-source compilers or Amazon SageMaker Training Compiler, which can speed up training of DL models by up to 50% by more efficiently using SageMaker GPU instances.

Start with small experiments, datasets, and compute resources

Experiment with smaller datasets in your development notebook. This allows you to iterate quickly with limited carbon emission.

When building your model, use Lifecycle Configuration Scripts to automatically stop idle SageMaker Notebook instances. If you are using SageMaker Studio, install the auto-shutdown Jupyter extension to detect and stop idle resources.
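The idle-notebook check that such a lifecycle script performs can be sketched roughly as follows. This is an illustration under stated assumptions, not the script AWS publishes: the one-hour `IDLE_TIMEOUT`, the helper names, and the way the kernel list is obtained are all hypothetical. The kernel dictionaries are assumed to have the `execution_state` and `last_activity` fields that the Jupyter server reports for running kernels, and `client` is assumed to be a `boto3.client("sagemaker")` object, passed in so the logic can be exercised without AWS access.

```python
from datetime import datetime, timedelta, timezone

IDLE_TIMEOUT = timedelta(hours=1)  # hypothetical threshold, tune to your workflow


def _parse_timestamp(ts):
    # Jupyter reports UTC timestamps ending in "Z"; normalize for fromisoformat.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def notebook_is_idle(kernels, now, timeout=IDLE_TIMEOUT):
    """True if no kernel is busy and none was active within `timeout`."""
    for kernel in kernels:
        if kernel["execution_state"] == "busy":
            return False
        if now - _parse_timestamp(kernel["last_activity"]) < timeout:
            return False
    return True


def stop_if_idle(client, instance_name, kernels, now=None):
    """Stop the notebook instance when it is idle; return whether it was stopped."""
    now = now or datetime.now(timezone.utc)
    if notebook_is_idle(kernels, now):
        # Assumes `client` is a boto3 SageMaker client.
        client.stop_notebook_instance(NotebookInstanceName=instance_name)
        return True
    return False
```

In a lifecycle configuration, a check like this would typically run on a schedule (for example from cron) so that a forgotten notebook is shut down without manual intervention.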

Incremental Training: Use artifacts from an existing model on an expanded dataset to train a new model.
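As a toy illustration of why incremental training saves compute, the sketch below warm-starts a tiny linear model from previously learned weights and compares it with training from scratch on the expanded dataset. It is plain Python rather than SageMaker code, and the data, learning rate, and epoch counts are made-up values chosen only to show the effect.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, w=0.0, b=0.0, epochs=5, lr=0.05):
    """Full-batch gradient descent, starting from the supplied weights."""
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# "Existing model": trained at length on the original dataset.
base_xs = [0.0, 1.0, 2.0, 3.0]
base_ys = [2 * x + 1 for x in base_xs]
base_w, base_b = train(base_xs, base_ys, epochs=500)

# Expanded dataset: the original points plus new ones.
new_xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
new_ys = [2 * x + 1 for x in new_xs]

# Incremental training reuses the saved weights; the baseline starts from zero.
warm_w, warm_b = train(new_xs, new_ys, w=base_w, b=base_b, epochs=5)
cold_w, cold_b = train(new_xs, new_ys, epochs=5)

warm_loss = mse(warm_w, warm_b, new_xs, new_ys)
cold_loss = mse(cold_w, cold_b, new_xs, new_ys)
```

With the same small epoch budget, the warm-started model ends with a much lower loss than the one trained from scratch, which is the saving that incremental training aims for.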

For example, a model trained on ImageNet (14 million images) can generalize to other datasets.
