Add Configuration File Support For Deletion Settings

May 22, 2025 by ADMIN

Introduction

In the current implementation of the script, key parameters such as the AWS region, bucket or snapshot identifiers, the deletion threshold (e.g., age in days), and flags like dry_run may be hardcoded directly into the script or passed manually via the CLI on every run. Hardcoded values require a code change for each new environment, and long argument lists are easy to mistype, which makes the script harder to maintain and scale. To address this, we propose adding support for loading these settings from an external configuration file (e.g., in YAML or JSON format).

Expected Behavior

The script should read settings from a config file (e.g., config.yaml or config.json). The settings may include:

AWS region
Resource names or IDs (e.g., S3 buckets, snapshot types)
Retention threshold (e.g., delete if older than X days)
Dry run toggle
Log level

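If the config file is missing or incomplete, the script should fall back to default values for any setting it cannot find. Values passed explicitly on the command line should override both the config file and the defaults.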
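One way to read this fallback rule is as a three-layer merge: built-in defaults first, then values from the config file, then any CLI overrides. The sketch below assumes that ordering. The `DEFAULTS` values and the `resolve_settings` helper are hypothetical names used for illustration, not part of the existing script.

```python
# Hypothetical defaults; the real script may choose different values.
DEFAULTS = {
    "aws_region": "us-east-1",
    "retention_days": 30,
    "dry_run": True,
    "log_level": "INFO",
}


def resolve_settings(file_config=None, cli_overrides=None):
    """Merge settings so CLI values beat file values, which beat defaults."""
    settings = dict(DEFAULTS)
    # A missing config file is treated the same as an empty one.
    settings.update(file_config or {})
    # Only CLI options that were actually supplied (not None) override.
    for key, value in (cli_overrides or {}).items():
        if value is not None:
            settings[key] = value
    return settings
```

Calling `resolve_settings({"retention_days": 7}, {"dry_run": False})` would keep the default region, take the retention period from the file, and take the dry-run flag from the CLI.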

Suggested Implementation

To implement this feature, we suggest the following steps:

1. Add a function to parse a YAML/JSON file using the `yaml` or `json` modules.

The json module ships with Python's standard library, while the yaml module is provided by the third-party PyYAML package (pip install pyyaml). For example, a YAML file can be parsed with yaml.safe_load:

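import yaml  # provided by the third-party PyYAML package

def load_config(file_path):
    """Load settings from a YAML file and return them as a dict."""
    with open(file_path, 'r', encoding='utf-8') as file:
        # safe_load avoids constructing arbitrary Python objects from the file
        config = yaml.safe_load(file)
    # An empty file parses to None; return an empty dict instead
    return config or {}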
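Since the proposal allows either format, a loader could pick the parser from the file extension. The following is a minimal sketch under that assumption. `load_config_file` is a hypothetical name, and the YAML branch assumes PyYAML is installed; it is imported lazily so that JSON-only use needs nothing beyond the standard library.

```python
import json
from pathlib import Path


def load_config_file(file_path):
    """Parse a .json, .yaml, or .yml file into a dict."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    with path.open("r", encoding="utf-8") as handle:
        if suffix == ".json":
            data = json.load(handle)
        elif suffix in (".yaml", ".yml"):
            import yaml  # third-party PyYAML, imported only when needed
            data = yaml.safe_load(handle)
        else:
            raise ValueError(f"Unsupported config format: {suffix!r}")
    # An empty YAML file parses to None; treat it as "no settings".
    if data is None:
        return {}
    if not isinstance(data, dict):
        raise ValueError("Top level of the config file must be a mapping")
    return data
```

Rejecting a non-mapping top level early keeps later lookups such as `config.get(...)` from failing with a less helpful error.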

2. Example `config.yaml` file:

A minimal file covering the region, retention threshold, dry-run toggle, log level, and target bucket might look like this:

aws_region: us-east-1
retention_days: 30
dry_run: true
log_level: INFO
s3_bucket: example-bucket

3. Use the config values in your script to dynamically configure behavior.

Once loaded, the config is a regular dictionary, so the script can read values from it instead of relying on hardcoded constants. For example, to read the AWS region:

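config = load_config('config.yaml')
# Fall back to a default if the key is missing (the default shown is illustrative)
aws_region = config.get('aws_region', 'us-east-1')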
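The same pattern applies to the retention threshold and the dry-run toggle. The sketch below shows one way they might drive the deletion logic. The `(name, created_at)` resource pairs and the `delete_fn` callback are hypothetical stand-ins for whatever the real script uses to list and delete AWS resources.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at, retention_days, now=None):
    """Return True if created_at is older than the retention threshold."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return created_at < cutoff


def run_cleanup(resources, config, delete_fn, now=None):
    """Delete (or, in dry-run mode, only report) expired resources.

    `resources` is a list of (name, created_at) pairs with timezone-aware
    datetimes. Returns the names that were selected for deletion.
    """
    retention_days = config.get("retention_days", 30)
    dry_run = config.get("dry_run", True)  # default to the safe option
    expired = [name for name, created in resources
               if is_expired(created, retention_days, now)]
    for name in expired:
        if dry_run:
            print(f"[dry run] would delete {name}")
        else:
            delete_fn(name)
    return expired
```

Defaulting `dry_run` to true when the key is absent is a design choice made here so that an incomplete config never deletes anything by accident.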

4. Optionally support command-line argument `--config config.yaml` to specify the config file path.

We can add a command-line argument --config to specify the config file path. For example:

import argparse

parser = argparse.ArgumentParser()
# Default to config.yaml in the current directory when --config is not given
parser.add_argument('--config', default='config.yaml', help='Path to the configuration file')
args = parser.parse_args()

config_file = args.config
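If individual settings should also be overridable from the command line, the parser can expose them with a default of `None`, so the script can tell "not passed" apart from an explicit value. This is a sketch only: the `--region`, `--retention-days`, and `--dry-run` flags are hypothetical, and `argparse.BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse


def build_parser():
    """Build a parser whose optional overrides default to None."""
    parser = argparse.ArgumentParser(description="Delete old resources")
    parser.add_argument("--config", default="config.yaml",
                        help="Path to the configuration file")
    # Hypothetical override flags; None means "not supplied on the CLI".
    parser.add_argument("--region", dest="aws_region", default=None)
    parser.add_argument("--retention-days", type=int, default=None)
    # Generates both --dry-run and --no-dry-run.
    parser.add_argument("--dry-run", action=argparse.BooleanOptionalAction,
                        default=None)
    return parser


def cli_overrides(args):
    """Keep only the settings the user actually supplied."""
    return {key: value for key, value in vars(args).items()
            if key != "config" and value is not None}
```

For instance, parsing `--retention-days 7 --no-dry-run` yields overrides for those two settings only, leaving the region to come from the config file or the defaults.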

Benefits

The proposed implementation has several benefits:

Improves separation of concerns (code vs. config): By separating the configuration from the code, we can make the code more modular and easier to maintain.
Makes it easier to reuse and test the script in different environments: With a configuration file, we can easily switch between different environments without modifying the code.
Enables version control of configuration without modifying code: We can version control the configuration file separately from the code, making it easier to track changes and collaborate with others.
Makes onboarding easier for other contributors or users: With a clear and well-documented configuration file, new contributors or users can easily understand how to configure the script.

Optional Enhancements

There are several optional enhancements that we can consider:

Validate config with a schema (e.g., using pydantic or cerberus): We can use a schema validation library to ensure that the configuration file is valid and conforms to a specific schema.
Print a warning for missing or invalid config entries: We can print a warning message if a config entry is missing or invalid, making it easier to identify and fix issues.
Support .env loading for secret keys if needed in the future: We can use an environment variable loader to load secret keys from a .env file, keeping sensitive information out of the main configuration file.
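The first two enhancements can be approximated with the standard library alone before committing to pydantic or cerberus. The sketch below assumes a small table of expected keys, types, and defaults. `EXPECTED` and `validate_config` are hypothetical names, and the defaults are illustrative.

```python
import logging

logger = logging.getLogger("config")

# Hypothetical schema: key -> (expected type, default used when missing or invalid).
EXPECTED = {
    "aws_region": (str, "us-east-1"),
    "retention_days": (int, 30),
    "dry_run": (bool, True),
    "log_level": (str, "INFO"),
}


def validate_config(config):
    """Return a cleaned copy of config and the list of warnings raised."""
    cleaned = dict(config)  # keys outside the schema pass through unchanged
    warnings = []
    for key, (expected_type, default) in EXPECTED.items():
        if key not in config:
            warnings.append(f"'{key}' is missing; using default {default!r}")
            cleaned[key] = default
            continue
        value = config[key]
        # bool is a subclass of int, so reject True/False where an int is expected.
        is_valid = isinstance(value, expected_type) and not (
            expected_type is int and isinstance(value, bool)
        )
        if not is_valid:
            warnings.append(
                f"'{key}' has invalid value {value!r}; using default {default!r}"
            )
            cleaned[key] = default
    for message in warnings:
        logger.warning(message)
    return cleaned, warnings
```

Returning the warnings alongside the cleaned settings lets the caller decide whether to continue or exit; a schema library would add richer checks such as value ranges and nested structures.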

Q: Why do we need a configuration file? Can't we just hardcode the settings?

A: Hardcoding settings can make the code rigid and inflexible, making it difficult to maintain and scale. A configuration file allows us to separate the configuration from the code, making it easier to modify and update settings without changing the code.

Q: What format should the configuration file be in?

A: The configuration file can be in either YAML or JSON format. Both formats are widely supported and easy to read and write.

Q: How do I specify the path to the configuration file?

A: You can specify the path to the configuration file using the --config command-line argument. For example: python script.py --config config.yaml

Q: What if the configuration file is missing or invalid?

A: If the configuration file is missing or invalid, the script should fall back to default values, with any CLI arguments still taking precedence. You can also add error handling to print a warning message or exit the script with an error code.
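A minimal sketch of that fallback is shown below, using JSON so that it depends only on the standard library. `load_config_or_default` is a hypothetical helper; a YAML version would catch `yaml.YAMLError` from PyYAML instead of `json.JSONDecodeError`.

```python
import json
import logging

logger = logging.getLogger("config")


def load_config_or_default(file_path, defaults):
    """Return defaults updated with the file's settings, or defaults alone on failure."""
    settings = dict(defaults)
    try:
        with open(file_path, "r", encoding="utf-8") as handle:
            loaded = json.load(handle)
    except FileNotFoundError:
        logger.warning("Config file %s not found; using defaults", file_path)
        return settings
    except json.JSONDecodeError as exc:
        logger.warning("Config file %s is not valid JSON (%s); using defaults",
                       file_path, exc)
        return settings
    if not isinstance(loaded, dict):
        logger.warning("Config file %s does not contain a mapping; using defaults",
                       file_path)
        return settings
    settings.update(loaded)
    return settings
```

For a deletion script, exiting with an error on an invalid file may be the safer choice; the warning-and-continue behavior here simply mirrors the fallback described in the answer.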

Q: Can I use environment variables to load secret keys?

A: Yes, you can use an environment variable loader to load secret keys from a .env file. This makes it easier to manage sensitive information and keep it separate from the configuration file.
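To illustrate the idea, the sketch below reads simple KEY=VALUE lines from a .env file without any third-party dependency. `load_env_file` is a hypothetical helper that handles only the most basic syntax; a dedicated package such as python-dotenv covers quoting and interpolation more thoroughly.

```python
import os


def load_env_file(path=".env", environ=None):
    """Copy KEY=VALUE pairs from a .env-style file into environ.

    Existing entries are not overwritten, so real environment variables
    take precedence over the file. A missing file is silently ignored.
    """
    environ = os.environ if environ is None else environ
    try:
        with open(path, "r", encoding="utf-8") as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        return environ
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip().strip('"').strip("'")
        environ.setdefault(key.strip(), value)
    return environ
```

The script would then read secrets with `os.environ.get(...)` rather than from the YAML or JSON file, and the .env file itself would normally be excluded from version control.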

Q: How do I validate the configuration file using a schema?

A: You can use a schema validation library such as pydantic or cerberus to validate the configuration file against a specific schema. This ensures that the configuration file is valid and conforms to the expected format.

Q: Can I use a different configuration file format, such as TOML or INI?

A: Yes, you can use a different configuration file format, such as TOML or INI, but you will need to modify the script to parse the file in the correct format.
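As an example of what that modification might involve, the sketch below reads the same settings from an INI file with the standard library's `configparser`. The `[deletion]` section name and the fallback values are assumptions made for illustration.

```python
import configparser


def load_ini_config(file_path, section="deletion"):
    """Read deletion settings from one section of an INI file."""
    parser = configparser.ConfigParser()
    # read() skips files it cannot open and returns the list it parsed.
    if not parser.read(file_path, encoding="utf-8"):
        raise FileNotFoundError(file_path)
    if not parser.has_section(section):
        raise KeyError(f"Missing [{section}] section in {file_path}")
    values = parser[section]
    # INI values are always strings, so convert types explicitly.
    return {
        "aws_region": values.get("aws_region", fallback="us-east-1"),
        "retention_days": values.getint("retention_days", fallback=30),
        "dry_run": values.getboolean("dry_run", fallback=True),
        "log_level": values.get("log_level", fallback="INFO"),
    }
```

Unlike YAML and JSON, INI carries no type information, which is why the integer and boolean settings need `getint` and `getboolean`.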

Q: How do I handle configuration file changes?

A: You can use a version control system such as Git to track changes to the configuration file. This makes it easier to collaborate with others and keep track of changes.

Q: Can I use a configuration file in a production environment?

A: Yes, you can use a configuration file in a production environment. In fact, using a configuration file can make it easier to manage and update settings in a production environment.

Q: How do I troubleshoot configuration file issues?

A: You can use a combination of error handling, logging, and debugging tools to troubleshoot configuration file issues. This includes checking the configuration file for errors, verifying that the file is being loaded correctly, and checking the script's behavior in response to configuration file changes.