AWS Resource Discovery Toolkit
Shell-based toolkit for discovering and auditing unmanaged AWS resources to facilitate Terraform imports and IaC adoption
Problem Statement
One of the most common challenges in cloud security and DevSecOps is infrastructure drift—the gap between what’s defined in Infrastructure as Code (IaC) and what actually exists in your AWS accounts. Resources get created manually through the console, via CloudFormation templates that aren’t version-controlled, or by team members experimenting with services. Over time, this creates several critical problems:
Security Risks: Unmanaged resources bypass security policies, compliance checks, and audit trails. You can’t secure what you don’t know exists.
Cost Waste: Orphaned EC2 instances, forgotten S3 buckets, and abandoned test environments accumulate charges. Without visibility, optimization is impossible.
Compliance Gaps: Security audits and compliance frameworks (SOC 2, ISO 27001, PCI-DSS) require complete infrastructure inventories. Undocumented resources create audit failures.
Team Inefficiency: When developers need to understand the infrastructure, they have to manually click through the AWS console across multiple accounts and regions—a time-consuming, error-prone process.
I needed a lightweight, scriptable solution to discover all AWS resources in an account, identify what’s not managed by Terraform, and generate human-readable reports that security teams and auditors could actually use. Commercial tools like CloudHealth or CloudCheckr are expensive and overkill for this specific use case. I wanted something fast, transparent, and customizable.
Solution Architecture
The AWS Resource Discovery Toolkit is a collection of modular bash scripts that use the AWS CLI to scan accounts for specific resource types. Each script can run independently or be orchestrated by a master aggregation script that produces comprehensive reports.
Discovery Workflow
graph TB
A[User Runs Script] --> B{Execution Mode}
B -->|Individual Script| C[discover-iam-roles.sh]
B -->|Individual Script| D[discover-ec2-instances.sh]
B -->|Individual Script| E[discover-s3-buckets.sh]
B -->|Master Script| F[discover-all.sh]
F --> C
F --> D
F --> E
F --> G[discover-oidc-providers.sh]
F --> H[discover-iam-policies.sh]
F --> I[discover-vpcs.sh]
C --> J{Output Format}
D --> J
E --> J
G --> J
H --> J
I --> J
J -->|JSON| K[Raw AWS CLI Output]
J -->|Table| L[Console-Friendly Tables]
J -->|Human| M[Detailed Text Reports]
K --> N[Timestamped Output Files]
L --> N
M --> N
F --> O[Consolidated Report]
F --> P[Summary Statistics]
style F fill:#4a6741,color:#fff
style O fill:#3a5231,color:#fff
Key Components
- Individual Discovery Scripts (6 resource types):
  - IAM Roles: Lists non-AWS managed roles with attached policies
  - OIDC Providers: Discovers GitLab/GitHub identity providers for CI/CD
  - IAM Policies: Enumerates customer-managed policies
  - EC2 Instances: Shows instance state, type, IPs, and tags
  - S3 Buckets: Reports encryption, versioning, and public access settings
  - VPCs: Lists VPCs with CIDR blocks and subnet counts
- Master Aggregation Script (discover-all.sh):
  - Executes all individual scripts sequentially
  - Generates consolidated reports with resource counts
  - Creates summary statistics (total resources by type)
  - Produces audit-ready documentation
- Output Formatters:
  - JSON: Raw AWS CLI output for programmatic processing
  - Table: Quick console view for rapid assessment
  - Human: Detailed text reports with metadata for audits
Technical Implementation
Script Architecture
Each discovery script follows a consistent pattern:
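#!/bin/bash
set -e                            # Exit on error
FORMAT="${1:-json}"               # Default to JSON output
OUTPUT_DIR="../output"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)  # Used to timestamp output files
case "$FORMAT" in
  json)  : ;;  # Raw AWS CLI JSON (discovery commands go here)
  table) : ;;  # Tabular console output
  human) : ;;  # Detailed human-readable report
  *) echo "Usage: $0 [json|table|human]" >&2; exit 1 ;;
esac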
Design principles:
- Fail-fast: set -e makes a script exit as soon as a command such as an AWS CLI call fails (a failure on the left side of a pipe is only caught if set -o pipefail is also enabled)
- Timestamped outputs: Every report includes a timestamp for audit trails
- Format flexibility: Choose output based on use case (automation vs. review)
- Minimal dependencies: Requires only the AWS CLI and jq (standard tools)
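As a rough illustration of the timestamped-output and format-flexibility principles, the helper below maps a format name to a file extension and builds a timestamped report path. The function name `report_path`, the extension mapping, and the `<resource>_<timestamp>.<ext>` naming scheme are assumptions made for this sketch, not the toolkit's actual layout.

```bash
#!/bin/bash
# Hypothetical helper: build a timestamped report path for a resource type.
# The naming scheme (<resource>_<timestamp>.<ext>) is an assumption.
report_path() {
  local resource="$1" format="$2"
  local output_dir="${OUTPUT_DIR:-../output}"
  local timestamp="${TIMESTAMP:-$(date +%Y%m%d_%H%M%S)}"
  local ext
  case "$format" in
    json)        ext="json" ;;
    table|human) ext="txt" ;;
    *) echo "Unknown format: $format" >&2; return 1 ;;
  esac
  printf '%s/%s_%s.%s\n' "$output_dir" "$resource" "$timestamp" "$ext"
}
```

A discovery script could then redirect its output with something like `aws iam list-roles --output json > "$(report_path iam-roles json)"`.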
IAM Role Discovery Example
The discover-iam-roles.sh script demonstrates the toolkit’s approach:
- List all roles via aws iam list-roles
- Filter out AWS-managed roles (those starting with “AWS”)
- For each custom role:
  - Retrieve role details (trust policy, description, creation date)
  - List attached managed policies
  - Report last usage timestamp
- Format output based on user preference
Human-readable output example:
Role: devops-operator
Path: /
Created: 2025-11-22T00:48:33+00:00
Description: Role for GitLab CI/CD pipeline
Attached Policies:
- AdministratorAccess
Last Used: 2025-11-22 in us-west-1
This gives security teams immediate visibility into who has access and when it was last used—critical for access reviews.
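The role discovery steps described in this section can be sketched roughly as follows. This is a minimal illustration, not the actual contents of discover-iam-roles.sh: the function names `filter_custom_roles` and `report_custom_roles` are hypothetical, and the sketch uses `--query` with text output instead of jq to keep it short. Last-used data is read from `aws iam get-role`, since `list-roles` does not populate it.

```bash
#!/bin/bash
# Sketch only: function names are hypothetical, not taken from the toolkit.

# Read role names on stdin (one per line) and drop those starting with "AWS".
filter_custom_roles() {
  local name
  while IFS= read -r name; do
    if [[ -z "$name" || "$name" == AWS* ]]; then
      continue
    fi
    printf '%s\n' "$name"
  done
}

# Print name, attached managed policies, and last-used date per custom role.
report_custom_roles() {
  local role policies last_used
  aws iam list-roles --query 'Roles[].RoleName' --output text \
    | tr '\t' '\n' \
    | filter_custom_roles \
    | while IFS= read -r role; do
        # </dev/null keeps the inner calls from consuming the loop's input.
        policies=$(aws iam list-attached-role-policies --role-name "$role" \
          --query 'AttachedPolicies[].PolicyName' --output text </dev/null)
        last_used=$(aws iam get-role --role-name "$role" \
          --query 'Role.RoleLastUsed.LastUsedDate' --output text </dev/null)
        printf 'Role: %s\n  Attached Policies: %s\n  Last Used: %s\n' \
          "$role" "${policies:-none}" "$last_used"
      done
}
```

With text output the CLI prints `None` for a role that has never been used, which is itself a useful signal during an access review.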
S3 Bucket Security Audit
The S3 discovery script goes beyond just listing buckets—it checks security configurations:
# For each bucket, check:
VERSIONING=$(aws s3api get-bucket-versioning --bucket "$BUCKET")
ENCRYPTION=$(aws s3api get-bucket-encryption --bucket "$BUCKET")
PUBLIC_BLOCK=$(aws s3api get-public-access-block --bucket "$BUCKET")
This identifies:
- Buckets without encryption (compliance violation)
- Buckets without versioning (data loss risk)
- Buckets allowing public access (security risk)
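One practical detail when turning these checks into a report: `get-bucket-encryption` and `get-public-access-block` exit non-zero when the bucket has no such configuration, which would abort a script running under `set -e`. The sketch below guards each call and prints one summary per bucket. The function name `summarize_bucket` and the wording of the status strings are assumptions for illustration only.

```bash
#!/bin/bash
# Sketch only: summarize_bucket is a hypothetical helper name.
summarize_bucket() {
  local bucket="$1" versioning encryption pab public_block

  # The CLI prints "None" when versioning was never configured on the bucket.
  versioning=$(aws s3api get-bucket-versioning --bucket "$bucket" \
    --query 'Status' --output text 2>/dev/null) || versioning="unknown"
  if [[ "$versioning" == "None" ]]; then versioning="Disabled"; fi

  # A non-zero exit usually means no encryption configuration exists
  # (it can also mean the caller lacks permission to read it).
  if aws s3api get-bucket-encryption --bucket "$bucket" >/dev/null 2>&1; then
    encryption="configured"
  else
    encryption="not configured or not readable"
  fi

  # Any "False" among the four settings leaves some public access possible.
  if pab=$(aws s3api get-public-access-block --bucket "$bucket" \
      --query 'PublicAccessBlockConfiguration.[BlockPublicAcls,IgnorePublicAcls,BlockPublicPolicy,RestrictPublicBuckets]' \
      --output text 2>/dev/null); then
    if grep -qi 'false' <<<"$pab"; then
      public_block="partially enabled"
    else
      public_block="fully enabled"
    fi
  else
    public_block="not configured or not readable"
  fi

  printf 'Bucket: %s\n  Versioning: %s\n  Encryption: %s\n  Public Access Block: %s\n' \
    "$bucket" "$versioning" "$encryption" "$public_block"
}
```

It could be driven by a loop over `aws s3api list-buckets --query 'Buckets[].Name' --output text`.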
Master Aggregation Report
The discover-all.sh script orchestrates all discoveries and generates:
- Summary Report (SUMMARY.txt):
    Account ID: 123456789012
    Region: us-west-1
    Scan Date: 2025-11-24 14:30:22
    Resource Counts:
      IAM Roles: 12
      EC2 Instances: 4
      S3 Buckets: 3
- Consolidated Human-Readable Report: All detailed reports combined
- Individual JSON/Table Files: For programmatic access
- Discovery Log: Complete execution log for debugging
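A summary file in this shape could be produced along the following lines. This is a sketch under stated assumptions: `write_summary` and its `Label=count` argument convention are hypothetical, and the real discover-all.sh may gather its counts differently. The account ID comes from `aws sts get-caller-identity`.

```bash
#!/bin/bash
# Sketch only: write_summary and its "Label=count" arguments are hypothetical.
write_summary() {
  local outfile="$1"
  shift
  local account_id region pair
  account_id=$(aws sts get-caller-identity --query 'Account' --output text)
  region="${AWS_DEFAULT_REGION:-unknown}"
  {
    echo "Account ID: $account_id"
    echo "Region: $region"
    echo "Scan Date: $(date '+%Y-%m-%d %H:%M:%S')"
    echo "Resource Counts:"
    for pair in "$@"; do
      # Split "Label=count" into its label and its count.
      echo "  ${pair%%=*}: ${pair#*=}"
    done
  } > "$outfile"
}
```

For example: `write_summary SUMMARY.txt "IAM Roles=12" "EC2 Instances=4" "S3 Buckets=3"`.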
Challenges & Solutions
Challenge 1: Handling AWS Pagination
Problem: AWS CLI returns paginated results for large resource lists. Early versions of the scripts only captured the first page, missing resources.
Solution: Use --output json with jq for complete results:
aws iam list-roles --output json | jq '.Roles[]'
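By default the AWS CLI follows pagination tokens and retrieves every page, unless --no-paginate or --max-items is passed. JSON output is assembled into a single document before it is printed, whereas with --output text any --query expression is evaluated once per page, so the JSON route is the safer one when completeness matters.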
Lesson Learned: Always test with accounts that have >100 resources of a type to catch pagination issues.
Challenge 2: Cross-Region Discovery
Problem: Resources are region-specific, but manually switching regions for each scan is tedious.
Solution: Added region loop capability:
for REGION in us-west-1 us-east-1 eu-west-1; do
  export AWS_DEFAULT_REGION="$REGION"
  ./discover-all.sh human "prod-$REGION"
done
Lesson Learned: Build scriptability into tools from the start—loops and automation come naturally with shell scripts.
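Instead of hard-coding the region list, the loop can ask AWS which regions are enabled for the account. The sketch below wraps that in a hypothetical helper, `for_each_region`, that runs any command once per region with `AWS_DEFAULT_REGION` set; the helper name and its error message are assumptions for illustration.

```bash
#!/bin/bash
# Sketch only: for_each_region is a hypothetical helper name.
# Runs the given command once per enabled region, with AWS_DEFAULT_REGION set.
for_each_region() {
  local regions region
  regions=$(aws ec2 describe-regions --query 'Regions[].RegionName' --output text)
  for region in $regions; do
    echo "== $region ==" >&2
    # Keep going if one region fails, but say so on stderr.
    AWS_DEFAULT_REGION="$region" "$@" || echo "Scan failed in $region" >&2
  done
}
```

Usage would be along the lines of `for_each_region ./discover-all.sh human`.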
Challenge 3: Rate Limiting on Large Accounts
Problem: Scanning accounts with thousands of resources triggered AWS API rate limits.
Solution: Added error handling that flags failed (typically throttled) calls, as a first step toward retry logic:
aws iam list-roles 2>&1 || echo "Rate limit exceeded, waiting..."
Future enhancement: exponential backoff for retries.
Lesson Learned: For regional services, AWS API rate limits generally apply per account and per region; IAM is a global service, so its limits are account-wide. For enterprise accounts, consider scanning regions in parallel instead of sequentially.
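The exponential backoff listed as a future enhancement could look roughly like this. It is a generic sketch, not code from the toolkit: `retry_with_backoff` is a hypothetical name, and it retries on any non-zero exit rather than inspecting the error for throttling specifically.

```bash
#!/bin/bash
# Sketch only: retry_with_backoff is a hypothetical helper name.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$(( delay * 2 ))      # Double the wait after every failure
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry_with_backoff 5 1 aws iam list-roles --output json` waits 1, 2, 4, and 8 seconds between its five attempts. The AWS CLI also has built-in retry handling that can be tuned with the `AWS_RETRY_MODE` and `AWS_MAX_ATTEMPTS` environment variables, which may be enough on its own for many accounts.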
Challenge 4: Sensitive Data in Reports
Problem: Discovery reports could inadvertently expose sensitive information (S3 bucket names, IAM policy details).
Solution:
- Added .gitignore to exclude the output/ directory
- Documented security considerations in README
- Recommended encrypting reports for compliance audits
Lesson Learned: Security tools need security controls. Always consider what data the tool itself exposes.
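One further control in the same spirit is to make the report directory readable only by the user running the scan. The sketch below is an assumption about how that might be done, not something the toolkit is documented to do; `secure_output_dir` is a hypothetical name.

```bash
#!/bin/bash
# Sketch only: secure_output_dir is a hypothetical helper, not part of the toolkit.
# Creates the report directory so that only the current user can access it.
secure_output_dir() {
  local dir="${1:-../output}"
  (
    umask 077        # Anything created in this subshell is owner-only
    mkdir -p "$dir"
  )
  chmod 700 "$dir"   # Also tighten a directory that already existed
}
```

Calling `secure_output_dir "$OUTPUT_DIR"` before the first report is written keeps other local users from reading bucket names or policy details.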
Results & Metrics
Immediate Impact
Time Savings:
- Manual AWS console review of 4 accounts across 6 resource types: ~3-4 hours
- Automated scan: ~5 minutes
Discovery Rate: First production run discovered:
- 12 IAM roles not in Terraform
- 4 orphaned EC2 instances (stopped, no tags)
- 1 OIDC provider manually created
- 2 S3 buckets without encryption
Cost Optimization: Identified $47/month in unnecessary costs from stopped instances and unused buckets.
Use Cases Enabled
- Terraform Import Planning: Before importing resources, run discovery to see what exists
- Security Audits: Generate compliance reports showing all IAM roles and policies
- Multi-Account Governance: Scan all AWS Organization accounts nightly for drift detection
- Onboarding: New team members can understand infrastructure in minutes vs. days
Adoption Potential
Target Users:
- DevSecOps engineers managing multi-account AWS environments
- Security teams conducting access reviews
- Organizations migrating to Infrastructure as Code
- Consultants auditing client AWS accounts
Open Source Value: Published on GitHub for community use, filling the gap between expensive commercial tools and manual console work.
Key Takeaways
- Shell scripts remain relevant: For AWS automation, bash + AWS CLI is often faster to develop than Python or Go, with no dependencies beyond the CLI and jq.
- Multi-format output is essential: JSON for automation, tables for quick checks, human-readable for documentation. Supporting all three multiplies the tool’s value.
- Security discovery is ongoing: This isn’t a one-time audit tool; it’s meant for continuous scanning. Integrating with CI/CD or cron jobs makes it a living inventory.
- Documentation drives adoption: The comprehensive README with use cases, troubleshooting, and examples is what turns a useful script into a reusable tool.
- Start modular, then aggregate: Building 6 focused scripts first, then combining them, was easier than building one monolithic tool. Users can pick what they need.
Future Enhancements
Short-term (v1.1)
- Add Lambda functions discovery - Serverless resources are often forgotten
- DynamoDB tables - Document NoSQL infrastructure
- CloudFormation stacks - Identify IaC already in use
- Security Groups - Critical for network security audits
- Export to CSV - For importing into spreadsheets or databases
Medium-term (v2.0)
- Terraform code generation - Auto-generate terraform import commands and basic .tf files
- Drift detection mode - Compare discovered resources against Terraform state
- Multi-account scanning - Use AWS Organizations to scan all accounts automatically
- Web UI - Simple dashboard for visualizing discovered resources
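To give a sense of what the planned Terraform code generation might involve, the sketch below turns a list of IAM role names into `terraform import` commands. The feature does not exist yet, so everything here is hypothetical: the function name `emit_role_imports`, the choice to derive the Terraform resource name from the role name, and the sanitizing rules. It relies only on the documented `terraform import ADDRESS ID` form and on the `aws_iam_role` resource being imported by role name.

```bash
#!/bin/bash
# Sketch only: the planned feature does not exist yet; names are hypothetical.
# Reads IAM role names on stdin and prints one `terraform import` command each.
emit_role_imports() {
  local role tf_name
  while IFS= read -r role; do
    if [[ -z "$role" ]]; then
      continue
    fi
    # Terraform names allow letters, digits, underscores, and dashes only.
    tf_name="${role//[^A-Za-z0-9_-]/_}"
    # They must start with a letter or underscore, so prefix one if needed.
    if [[ "$tf_name" == [0-9-]* ]]; then
      tf_name="_${tf_name}"
    fi
    printf 'terraform import aws_iam_role.%s %s\n' "$tf_name" "$role"
  done
}
```

Each generated command still needs a matching `resource "aws_iam_role"` block in the configuration before the import can succeed.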
Long-term (v3.0+)
- Real-time change detection - Use CloudTrail/EventBridge to detect new unmanaged resources
- Policy-as-code validation - Check if discovered resources comply with organizational policies
- Integration with Terraform Cloud - Push discoveries to TFC for automated import workflows
- Slack/email notifications - Alert when new unmanaged resources are detected
Resources
- Repository: github.com/hmbldv/aws-scripts
- Documentation: README.md with usage examples
- AWS CLI Reference: AWS Documentation
- Related Tool: Terraformer - More comprehensive IaC generation
This project demonstrates proficiency in: AWS security auditing, shell scripting, Infrastructure as Code migration, DevSecOps automation, compliance reporting, and building reusable tooling for cloud environments.