# Security Best Practices Standards for Fly.io
This document outlines security best practices for developing and deploying applications on the Fly.io platform. Adhering to these standards will help protect your applications and data from common vulnerabilities and ensure a secure and reliable deployment.
## 1. Secure Configuration and Secrets Management
### 1.1. Secure Secrets Storage
**Standard:** Never hardcode secrets directly in your application code, Dockerfiles, or configuration files. Use Fly.io's built-in secrets management.
**Why:** Hardcoding secrets exposes them to anyone with access to your codebase or container images. Fly.io secrets are encrypted at rest and in transit, minimizing the risk of exposure.
**Do This:**
* Use "flyctl secrets" to manage secrets.
"""bash
flyctl secrets set DATABASE_URL="postgres://user:password@host:port/database"
flyctl secrets set API_KEY="your_super_secret_api_key"
"""
* Access secrets in your application code through environment variables.
"""python
# Python example
import os
database_url = os.environ.get("DATABASE_URL")
api_key = os.environ.get("API_KEY")
if not database_url or not api_key:
    raise ValueError("Required secrets are not set.")
# Use database_url and api_key to connect to your database and make API calls
"""
**Don't Do This:**
* Hardcode secrets in your code:
"""python
# Python example - BAD PRACTICE
database_url = "postgres://user:password@host:port/database"
api_key = "your_super_secret_api_key"
"""
* Store secrets in version control.
* Expose secrets in logs.
**Anti-Pattern:** Using ".env" files in production. While convenient for local development, they are not secure for production deployments and can easily be accidentally committed to source control or exposed.
### 1.2. Environment-Specific Configuration
**Standard:** Separate configuration for development, staging, and production environments.
**Why:** Using the same configuration across environments can lead to misconfiguration and security vulnerabilities. For example, using production API keys in a development environment could expose sensitive data.
**Do This:**
* Utilize Fly.io's built-in support for environment variables to specify configurations.
* Use separate Fly.io apps for each environment (e.g., "myapp-dev", "myapp-staging", "myapp-prod").
* Create and manage environment-specific secrets using "flyctl secrets".
"""bash
# Set secrets for the production app
flyctl secrets set --app myapp-prod DATABASE_URL="..." API_KEY="..."
# Set secrets for the staging app
flyctl secrets set --app myapp-staging DATABASE_URL="..." API_KEY="..."
"""
**Don't Do This:**
* Use the same secrets across all environments.
* Rely on manual configuration changes between environments.
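One way to avoid manual configuration changes between environments is to read every environment-specific value from the process environment once, at startup. The following is a minimal sketch, not a prescribed implementation: "APP_ENV" and the "Settings" class are hypothetical names introduced for illustration, while "DATABASE_URL" and "PORT" follow the examples in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Environment-specific settings, read once at startup."""
    environment: str
    database_url: str
    port: int
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # APP_ENV is a hypothetical variable name used only in this sketch.
    environment = env.get("APP_ENV", "development")
    if environment not in {"development", "staging", "production"}:
        raise ValueError(f"Unknown APP_ENV: {environment!r}")
    # DATABASE_URL is expected to be set per app with "flyctl secrets set".
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise ValueError("DATABASE_URL is not set for this environment.")
    return Settings(
        environment=environment,
        database_url=database_url,
        port=int(env.get("PORT", "8080")),
        # Debug features are only enabled in development.
        debug=environment == "development",
    )
```

Because the function fails fast on a missing or unknown value, a misconfigured app stops at startup instead of running with another environment's settings.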
**Code Example:**
"""toml
# fly.toml - Example configuration with build settings and environment variables
[build]
dockerfile = "Dockerfile"
# Select the Dockerfile stage to build for this environment.
build-target = "release" # example
# Pass in build-time variables that depend on the target environment.
[build.args]
NODE_ENV = "production"
[env]
PORT = "8080"
[deploy]
release_command = "/app/migrate_db"
"""
### 1.3. Principle of Least Privilege
**Standard:** Grant the minimum necessary privileges to users, applications, and services.
**Why:** Limiting access reduces the potential impact of security breaches. If a compromised account or service has limited privileges, the attacker's ability to cause damage is significantly reduced.
**Do This:**
* Use Fly.io's role-based access control (RBAC) features where they apply. Note that Fly.io currently offers only limited RBAC.
* Ensure applications running within VMs only have the permissions they need, using "USER" directives in Dockerfiles.
* Configure firewall rules to restrict network access to only necessary ports and services.
**Don't Do This:**
* Run applications as root unless absolutely necessary.
* Grant broad permissions to services or users without a specific justification.
**Code Example (Dockerfile):**
"""dockerfile
# Pin a specific release instead of "latest" for reproducible builds
FROM ubuntu:22.04
# Update and install necessary packages
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
# Create a non-root user
RUN useradd -m -s /bin/bash appuser
# Set the working directory
WORKDIR /app
# Copy application files
COPY . .
# Change ownership of the application directory to the non-root user
RUN chown -R appuser:appuser /app
# Switch to the non-root user
USER appuser
# Install Python dependencies as appuser so "--user" installs into /home/appuser/.local
RUN pip3 install --user -r requirements.txt
# Command to run the application
CMD ["python3", "app.py"]
"""
### 1.4. Regular Security Audits and Updates
**Standard:** Regularly review your application code, dependencies, and infrastructure for security vulnerabilities. Keep your software up-to-date with the latest security patches.
**Why:** New vulnerabilities are discovered regularly. Staying up-to-date with security patches helps prevent exploits. Regular audits can identify potential vulnerabilities early.
**Do This:**
* Use automated vulnerability scanning tools (e.g., Snyk, Trivy) to scan your dependencies and container images.
* Subscribe to security mailing lists and advisories for the technologies you use (e.g., Python, Node.js, PostgreSQL).
* Regularly update your base images in your Dockerfiles.
* Implement a process for reviewing and addressing security vulnerabilities promptly.
**Don't Do This:**
* Ignore security alerts or vulnerabilities.
* Use outdated versions of software without security patches.
**Code Example (using Snyk in a CI/CD pipeline):**
"""yaml
# .github/workflows/security.yml - Example GitHub Actions workflow for running Snyk tests.
name: Security Scan
on:
  push:
    branches: [ main ] # or whatever your main branch is
  pull_request:
    branches: [ main ]
jobs:
  snyk:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Snyk to check for vulnerabilities
        uses: snyk/actions/python@master # Or javascript etc, adjust as needed
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --file=requirements.txt --severity-threshold=high
"""
## 2. Securing Network Communications
### 2.1. HTTPS for All Traffic
**Standard:** Use HTTPS for all communication between clients and your Fly.io application.
**Why:** HTTPS encrypts data in transit, preventing eavesdropping and man-in-the-middle attacks.
**Do This:**
* Let Fly.io provision TLS certificates for your application automatically. Fly.io issues free certificates through Let's Encrypt.
"""bash
flyctl certs show your-app-name.fly.dev
"""
* Ensure HTTP traffic is redirected to HTTPS, either by setting "force_https = true" under "[http_service]" in "fly.toml" or in your own web server.
**Don't Do This:**
* Use plain HTTP for sensitive data.
* Disable TLS encryption.
**Code Example (configuring redirection in a web server):**
"""nginx
# nginx configuration to redirect HTTP to HTTPS
server {
    listen 80;
    server_name your-app-name.fly.dev;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name your-app-name.fly.dev;
    # SSL certificate configuration
    ssl_certificate /etc/letsencrypt/live/your-app-name.fly.dev/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your-app-name.fly.dev/privkey.pem;
    # ... other configurations ...
}
"""
### 2.2. Firewall Configuration
**Standard:** Configure firewall rules (e.g., using iptables or UFW) to limit network access to only necessary ports and services.
**Why:** Firewalls prevent unauthorized access to your application and reduce the attack surface.
**Do This:**
* Use Fly.io's private networking to isolate apps.
* Use a tool like "ufw" to manage firewall rules inside of your VM.
**Don't Do This:**
* Leave unnecessary ports open to the public internet.
* Disable the firewall.
**Code Example (using "ufw" to allow only SSH and HTTP/HTTPS traffic):**
"""bash
# Allow SSH access
ufw allow OpenSSH
# Allow HTTP traffic
ufw allow 80
# Allow HTTPS traffic
ufw allow 443
# Enable the firewall
ufw enable
# Check the firewall status
ufw status
"""
### 2.3. Mutual TLS (mTLS)
**Standard:** Use mTLS for secure communication between services within your Fly.io private network.
**Why:** mTLS provides strong authentication and encryption by requiring both the client and server to present valid certificates.
**Do This:**
* Generate client and server certificates using a tool like OpenSSL.
* Configure your services to require client certificates during TLS handshakes.
* Distribute client certificates securely.
**Don't Do This:**
* Use self-signed certificates in production without proper validation.
* Store private keys in insecure locations.
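The server side of the mTLS requirement above can be sketched with Python's standard "ssl" module. This is a minimal illustration under stated assumptions, not a complete deployment recipe: the function name and the three file-path parameters are hypothetical, and the paths are optional only so the policy can be inspected without key material. A real service must supply all three.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    certfile: Optional[str] = None,
    keyfile: Optional[str] = None,
    client_ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Return a server-side TLS context that rejects clients without a valid certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        # The server's own certificate and private key.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if client_ca_file:
        # The CA that signed the client certificates you distribute.
        context.load_verify_locations(cafile=client_ca_file)
    return context
```

The returned context can be passed to "context.wrap_socket(sock, server_side=True)" or to an HTTP server that accepts an "SSLContext". Clients need a matching context that loads their own certificate and trusts the server's CA.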
### 2.4. Monitoring and Logging
**Standard:** Implement comprehensive logging and monitoring to detect and respond to security incidents.
**Why:** Logging and monitoring provide visibility into your application's behavior, allowing you to identify suspicious activity and security vulnerabilities.
**Do This:**
* Use a centralized logging system to collect logs from all your Fly.io applications and services (e.g., Grafana Loki).
* Monitor key security metrics, such as authentication failures, API request rates, and error rates.
**Don't Do This:**
* Disable logging.
* Store sensitive data in logs without proper redaction.
* Ignore suspicious activity detected by monitoring systems.
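Monitoring a security metric such as authentication failures usually comes down to counting events in a time window and alerting past a threshold. The sketch below shows that idea with the standard library only. The class name, threshold, and window are hypothetical choices, and a production setup would more likely export the counter to a monitoring system than evaluate it in-process.

```python
from collections import deque
from typing import Deque


class FailureRateMonitor:
    """Counts events in a sliding time window and flags when a threshold is exceeded."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record_failure(self, now: float) -> bool:
        """Record one failure at time `now`; return True if the threshold is exceeded."""
        self._timestamps.append(now)
        # Drop events that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

In an application, "now" would typically come from "time.monotonic()", and a True result would trigger an alert or a temporary lockout rather than being ignored.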
## 3. Application Security
### 3.1. Input Validation and Output Encoding
**Standard:** Validate all input data from clients and other services. Encode output data to prevent cross-site scripting (XSS) and other injection attacks.
**Why:** Input validation prevents attackers from injecting malicious code or data into your application. Output encoding prevents injected code from being executed in the client's browser.
**Do This:**
* Use server-side validation to verify the format, type, and length of all input data.
* Use a templating engine with automatic output encoding (e.g., Jinja2 for Python, Handlebars for JavaScript).
**Don't Do This:**
* Trust client-side validation alone.
* Display raw user input without encoding.
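The server-side checks and output encoding described above can be sketched with the standard library alone. The username policy below (3 to 30 characters from a small allowed set) is a hypothetical example, not a recommendation for every field. In a web framework, a templating engine with autoescaping normally performs the encoding step for you.

```python
import html
import re

# Hypothetical policy: 3-30 characters, letters, digits, underscore, hyphen.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,30}")


def validate_username(value: object) -> str:
    """Check type, length, and format; return the value or raise ValueError."""
    if not isinstance(value, str):
        raise ValueError("username must be a string")
    if not USERNAME_PATTERN.fullmatch(value):
        raise ValueError("username must be 3-30 characters: letters, digits, '_' or '-'")
    return value


def encode_for_html(value: str) -> str:
    """Escape &, <, >, and quotes so the value is rendered as text, not markup."""
    return html.escape(value, quote=True)
```

Validation rejects input that does not match an allow-list, while encoding is applied at output time to whatever is displayed. The two steps complement each other and neither replaces the other.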
**Code Example (Python using Flask and Jinja2):**
"""python
# Flask example with Jinja2 templating engine
from flask import Flask, request, render_template
import bleach
app = Flask(__name__)
@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        # Validate the input
        name = request.form.get('name')
        if not name or len(name) > 100:
            return render_template('index.html', error='Invalid name')
        # Sanitize HTML input using bleach (default to '' so a missing field does not raise)
        message = bleach.clean(request.form.get('message', ''))
        # Render the template with the sanitized message
        return render_template('index.html', name=name, message=message)
    return render_template('index.html')
"""
"""html
{# templates/index.html - Jinja2 template; autoescaping encodes the rendered variables #}
{% if error %}
<p>{{ error }}</p>
{% endif %}
<form method="post">
Name:<br>
<input type="text" name="name"><br><br>
Message:<br>
<textarea name="message"></textarea><br><br>
<input type="submit" value="Submit">
</form>
{% if name and message %}
Hello, {{ name }}!
<p>Your message: {{ message }}</p>
{% endif %}
"""
### 3.2. Cross-Site Request Forgery (CSRF) Protection
**Standard:** Implement CSRF protection to prevent attackers from forging requests on behalf of authenticated users.
**Why:** CSRF attacks can allow attackers to perform unauthorized actions on behalf of logged-in users.
**Do This:**
* Use a CSRF token that is unique to each user session.
* Include the CSRF token in all forms and AJAX requests.
* Validate the CSRF token on the server before processing the request.
**Don't Do This:**
* Disable CSRF protection.
* Use the same CSRF token for all users.
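The token lifecycle in the list above (issue per session, embed in the form, validate on the server) can be sketched independently of any framework. This is an illustrative sketch only: the function names and the "csrf_token" session key are hypothetical, and a plain dict stands in for the framework's session object. In practice, prefer your framework's built-in CSRF support.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token, store it in the user's session, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: dict, submitted: str) -> bool:
    """Compare the submitted token with the session's token in constant time."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

Each session gets its own unpredictable token, so a request forged from another site cannot supply the right value.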
**Code Example (Python using Flask and WTForms):**
"""python
# Python using Flask and WTForms
import os
from flask import Flask, render_template, session, redirect, url_for
from flask_wtf import FlaskForm
from flask_wtf.csrf import CSRFProtect
from wtforms import StringField, SubmitField
from wtforms.validators import DataRequired
app = Flask(__name__)
# Load the signing key from the environment (for example, a Fly.io secret); never hardcode it
app.config['SECRET_KEY'] = os.environ['SECRET_KEY']
csrf = CSRFProtect(app)
class MyForm(FlaskForm):
    name = StringField('Name', validators=[DataRequired()])
    submit = SubmitField('Submit')
@app.route('/', methods=['GET', 'POST'])
def index():
    form = MyForm()
    if form.validate_on_submit():
        session['name'] = form.name.data
        return redirect(url_for('success'))
    return render_template('index.html', form=form)
@app.route('/success')
def success():
    if 'name' in session:
        name = session['name']
        return render_template('success.html', name=name)
    else:
        return redirect(url_for('index'))
if __name__ == '__main__':
    # Debug mode exposes an interactive debugger; keep it off outside local development
    app.run(debug=False)
"""
### 3.3. Authentication and Authorization
**Standard:** Implement strong authentication and authorization mechanisms to control access to your application.
**Why:** Authentication verifies the identity of users, while authorization determines what resources they are allowed to access.
**Do This:**
* Use strong password policies (e.g., minimum length, complexity requirements).
* Implement multi-factor authentication (MFA) for privileged accounts.
* Use a role-based access control (RBAC) system to manage user permissions.
* Store passwords securely using a strong hashing algorithm (e.g., bcrypt, Argon2).
**Don't Do This:**
* Store passwords in plain text.
* Use weak or default passwords.
* Grant excessive permissions to users.
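Salted password hashing and verification can be sketched as follows. Note the substitution: this sketch uses scrypt from Python's standard library as a stand-in, because bcrypt and Argon2 require third-party packages. The cost parameters and function names are assumptions for illustration and should be tuned and reviewed before real use.

```python
import hashlib
import hmac
import os

# scrypt cost parameters; assumptions to be tuned for your hardware.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1
SALT_BYTES = 16


def _derive_key(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32
    )


def hash_password(password: str) -> bytes:
    """Return salt + derived key; a fresh random salt is generated per password."""
    salt = os.urandom(SALT_BYTES)
    return salt + _derive_key(password, salt)


def verify_password(password: str, stored: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    salt, expected = stored[:SALT_BYTES], stored[SALT_BYTES:]
    return hmac.compare_digest(_derive_key(password, salt), expected)
```

Only the output of "hash_password" is stored. The plain-text password is never written to the database or to logs.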
### 3.4. Dependency Management
**Standard:** Keep your application's dependencies up-to-date and use tools to detect and prevent vulnerable dependencies.
**Why:** Vulnerabilities in dependencies can be exploited to compromise your application.
**Do This:**
* Use a dependency management tool (e.g., pip for Python, npm for Node.js) to manage your application's dependencies.
* Regularly update your dependencies to the latest versions.
* Use automated vulnerability scanning tools (e.g., Snyk, OWASP Dependency-Check).
**Don't Do This:**
* Use outdated dependencies without security patches.
* Ignore security alerts from dependency scanning tools.
### 3.5. Error Handling and Logging
**Standard:** Handle errors gracefully and log sufficient information to diagnose problems.
**Why:** Proper error handling prevents sensitive information from being exposed to users. Logging provides valuable information for debugging and security incident response.
**Do This:**
* Implement a global error handler to catch unexpected exceptions.
* Log errors with sufficient detail to identify the root cause.
* Redact sensitive information (e.g., passwords, API keys) from logs.
* Use structured logging to make logs easier to query and analyze.
**Don't Do This:**
* Expose stack traces or other sensitive information to users in error messages.
* Log sensitive data in plain text.
* Ignore errors or warnings.
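Redacting sensitive values before they reach a log file can be sketched with a "logging.Filter". The pattern below is a hypothetical example that only catches "key=value" pairs with a few credential-like key names. Real redaction rules depend on how your application formats its messages.

```python
import logging
import re

# Hypothetical pattern: key=value pairs whose key looks like a credential.
SECRET_PATTERN = re.compile(r"(?i)\b(password|api_key|token|secret)=\S+")


class RedactSecretsFilter(logging.Filter):
    """Replace credential values in log messages before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merges msg with its % arguments
        record.msg = SECRET_PATTERN.sub(r"\1=[REDACTED]", message)
        record.args = ()  # arguments are already merged into msg
        return True  # keep the record, now redacted
```

Attaching the filter to a handler with "handler.addFilter(RedactSecretsFilter())" applies it to every record that handler emits, regardless of which logger produced it.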
## 4. Dockerfile and Image Security
### 4.1. Minimal Base Images
**Standard:** Use minimal base images for your Docker containers to reduce the attack surface.
**Why:** Smaller images contain fewer dependencies, reducing the number of potential vulnerabilities.
**Do This:**
* Use lightweight base images like Alpine Linux or distroless images.
**Don't Do This:**
* Use full-featured base images like Ubuntu or Debian unless necessary.
**Code Example (using Alpine Linux as a base image):**
"""dockerfile
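FROM python:3.9-alpine
# Set the working directory
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application files
COPY . .
# Command to run the application
CMD ["python", "app.py"]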
"""
### 4.2. Multi-Stage Builds
**Standard:** Use multi-stage builds to separate build-time dependencies from runtime dependencies.
**Why:** Multi-stage builds allow you to include build tools and dependencies in a temporary build environment, and then copy only the necessary artifacts to the final image.
**Do This:**
* Use separate "FROM" instructions for the build and runtime stages.
* Copy only the necessary files and dependencies from the build stage to the runtime stage.
**Don't Do This:**
* Include unnecessary build tools or dependencies in the final image.
**Code Example (using multi-stage build):**
"""dockerfile
# Build Stage
FROM golang:1.21 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . ./
# Disable cgo so the binary is statically linked and runs on Alpine (musl libc)
RUN CGO_ENABLED=0 go build -o /app/mybinary
# Production Stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/mybinary /app/mybinary
CMD ["/app/mybinary"]
"""
### 4.3. Image Scanning
**Standard:** Scan your Docker images for vulnerabilities before deploying them to Fly.io.
**Why:** Image scanning identifies potential vulnerabilities in your container images before they can be exploited.
**Do This:**
* Use a container image scanning tool (e.g., Trivy, Clair, Anchore).
* Integrate image scanning into your CI/CD pipeline.
* Address vulnerabilities identified by the scanner before deploying the image.
These standards cover security best practices for applications on Fly.io. Following them reduces risk for development teams, and they should be enforced in CI/CD where possible.
danielsogl
Created Mar 6, 2025
# Component Design Standards for Fly.io
This document outlines the component design standards for applications deployed on Fly.io. Adhering to these guidelines will promote maintainability, reusability, performance, and security in your Fly.io applications.
## 1. Introduction to Component Design in Fly.io
Component design in Fly.io focuses on creating modular, independent, and reusable parts of an application that are easy to develop, test, and maintain. Given Fly.io's geographically distributed nature, well-designed components also contribute to improved latency and resilience. In this context, "component" is a logical grouping of functionalities, often corresponding to modules, classes, or services.
* **Goal:** Build robust, scalable, and maintainable applications on Fly.io.
* **Focus:** Modularity, reusability, performance, and security.
## 2. Architectural Considerations
### 2.1 Microservices vs. Monolith with Modules
Fly.io supports both microservice and monolithic architectures (with a modular design). The choice depends on the application's complexity and scalability needs.
* **Microservices:** Independent, deployable services communicating over the network. Suited for complex applications requiring independent scaling and fault isolation.
* **Monolith with Modules:** A single application with clear module boundaries internally. Suitable for smaller applications or when operational overhead of microservices is a concern.
**Do This:**
* For large applications, decompose into loosely coupled microservices, each handling a specific domain.
* For smaller projects, leverage a modular approach within a monolithic application.
**Don't Do This:**
* Create tightly coupled microservices that lead to a distributed monolith.
* Build a monolithic application with no modularity, resulting in unmaintainable code.
**Why:** Microservices offer better scalability and fault isolation, while modular monoliths simplify development and deployment for smaller applications.
Proper modularity reduces dependencies, which helps isolate deployment errors and simplifies development.
**Example (Microservice):**
"""dockerfile
# Dockerfile for a user service
FROM python:3.11-slim-bookworm
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "user_service.py"]
"""
**Example (Monolith with Modules):**
"""python
# app.py
from user_module import User
from product_module import Product
# Use the modules
user = User(name="John Doe")
product = Product(name="Awesome Product")
print(f"User: {user.name}, Product: {product.name}")
"""
### 2.2 Location Awareness on Fly.io
Fly.io's ability to run applications close to users means components should be designed with location awareness in mind.
* **Data locality:** Store and process data in the region closest to the users.
* **Regional deployments:** Deploy specific components to particular Fly.io regions.
**Do This:**
* Use Fly.io's region routing features to direct traffic to the nearest instance of a component.
* Implement caching strategies to minimize cross-region data access.
**Don't Do This:**
* Assume all users are geographically close to a single server.
* Ignore latency implications of cross-region data access.
**Why:** Minimizing latency improves the user experience and reduces bandwidth costs.
**Example (Fly.io Region Routing with "fly.toml"):**
"""toml
app = "my-app"
primary_region = "iad" # Initial region
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
# Note: "fly.toml" does not define per-path region routes or a list of deploy regions.
# Fly.io's proxy sends each request to the closest region with a running Machine,
# so regional placement depends on where Machines are created, for example in
# "iad", "fra", and "syd" (typically with "fly scale count" and its "--region" option).
"""
### 2.3 Fault Tolerance & Resilience
Fly.io's distributed nature requires components to be fault-tolerant.
* **Replication:** Run multiple instances of each component across different regions.
* **Circuit Breakers:** Implement the circuit breaker pattern to prevent cascading failures.
* **Health checks:** Use Fly.io's health checks to monitor component availability and automatically restart failed instances.
**Do This:**
* Configure health checks for all critical components in your "fly.toml".
* Use retry mechanisms with exponential backoff for communication between components.
* Implement circuit breakers to isolate failing components.
**Don't Do This:**
* Rely on a single instance of a component without redundancy.
* Allow one failing component to bring down the entire application.
**Why:** Redundancy and fault isolation ensure higher availability and a better user experience.
**Example (Fly.io Health Check in "fly.toml"):**
"""toml
app = "my-app"
primary_region = "iad"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
# Checks are declared as an array of tables, hence the double brackets
[[http_service.checks]]
path = "/healthz" # endpoint of your healthcheck
interval = "10s"
timeout = "5s"
"""
## 3. Coding Standards for Components
### 3.1 Single Responsibility Principle (SRP)
Each component should have one, and only one, reason to change.
**Do This:**
* Design classes and modules with a clear, focused purpose.
* Refactor large components into smaller, more manageable units.
**Don't Do This:**
* Create "god classes" or modules that handle multiple unrelated tasks.
**Why:** Makes components easier to understand, test, and maintain.
**Example (Python SRP):**
"""python
# Good: Separate classes for User and Email
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
class EmailService:
    def send_welcome_email(self, user):
        print(f"Sending welcome email to {user.email}")
# Bad: User class handles both user data and email sending
class UserWithEmail:
    def __init__(self, name, email):
        self.name = name
        self.email = email
    def send_welcome_email(self):  # Violates SRP: User shouldn't handle email
        print(f"Sending welcome email to {self.email}")
user = User("John Doe", "john@example.com")
email_service = EmailService()
email_service.send_welcome_email(user)
"""
### 3.2 Open/Closed Principle (OCP)
Components should be open for extension but closed for modification.
**Do This:**
* Use inheritance or composition to add new functionality without modifying existing code.
* Favor interfaces and abstract classes to decouple components.
**Don't Do This:**
* Directly modify existing code to add new features, risking regressions.
**Why:** Reduces the risk of introducing bugs when adding new features.
**Example (Python OCP):**
"""python
# Good: Using Strategy Pattern
from abc import ABC, abstractmethod
class PaymentStrategy(ABC):
    @abstractmethod
    def pay(self, amount):
        pass
class CreditCardPayment(PaymentStrategy):
    def pay(self, amount):
        print(f"Paying {amount} with credit card")
class PayPalPayment(PaymentStrategy):
    def pay(self, amount):
        print(f"Paying {amount} with PayPal")
class ShoppingCart:
    def __init__(self, payment_strategy: PaymentStrategy):
        self.payment_strategy = payment_strategy
    def checkout(self, amount):
        self.payment_strategy.pay(amount)
# Bad: Modifying the ShoppingCart class directly
class ShoppingCartBad:
    def checkout(self, amount, payment_method):
        if payment_method == "credit_card":
            print(f"Paying {amount} with credit card")
        elif payment_method == "paypal":
            print(f"Paying {amount} with PayPal")
        else:
            print("Invalid payment method")
cart = ShoppingCart(CreditCardPayment())
cart.checkout(100)
"""
### 3.3 Liskov Substitution Principle (LSP)
Subtypes must be substitutable for their base types without altering the correctness of the program.
**Do This:**
* Ensure that subclasses correctly implement the behavior of their base classes.
* Avoid introducing unexpected side effects in subclasses.
**Don't Do This:**
* Create subclasses that violate the contract of their base classes.
**Why:** Prevents unexpected behavior and ensures that polymorphism works correctly.
**Example (violating Liskov Substitution):**
"""python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
    def set_width(self, width):
        self.width = width
    def set_height(self, height):
        self.height = height
    def area(self):
        return self.width * self.height
class Square(Rectangle):  # Violates LSP as Square's invariant is width == height
    def __init__(self, size):
        super().__init__(size, size)
    def set_width(self, width):
        self.width = width
        self.height = width
    def set_height(self, height):
        self.width = height
        self.height = height
def print_area(rectangle: Rectangle):
    rectangle.set_width(5)
    rectangle.set_height(4)
    print(rectangle.area())
rectangle = Rectangle(2, 3)
print_area(rectangle)  # Output: 20
square = Square(2)
print_area(square)  # Output: 16 (incorrect if we expect a standard rectangle behavior)
"""
In this example, the "Square" class violates LSP because setting the width or height also sets the other dimension, which is not the behavior expected of a generic "Rectangle".
### 3.4 Interface Segregation Principle (ISP)
Clients should not be forced to depend upon interfaces that they do not use.
**Do This:**
* Create small, specific interfaces instead of large, general-purpose ones.
* Refactor interfaces to separate unrelated methods.
**Don't Do This:**
* Force classes to implement methods they don't need.
**Why:** Reduces dependencies and improves code flexibility.
**Example (Python ISP):**
"""python
# Good: Separate interfaces for different functionalities
from abc import ABC, abstractmethod
class Printer(ABC):
    @abstractmethod
    def print_document(self, document):
        pass
class Scanner(ABC):
    @abstractmethod
    def scan_document(self, document):
        pass
class Copier(ABC):
    @abstractmethod
    def copy_document(self, document):
        pass
# Bad: One large interface with all functionalities mixed
class MultiFunctionDevice(ABC):
    @abstractmethod
    def print_document(self, document):
        pass
    @abstractmethod
    def scan_document(self, document):
        pass
    @abstractmethod
    def copy_document(self, document):
        pass
class SimplePrinter(Printer):
    def print_document(self, document):
        print(f"Printing {document}")
class AllInOnePrinter(Printer, Scanner, Copier):
    def print_document(self, document):
        print(f"Printing {document}")
    def scan_document(self, document):
        print(f"Scanning {document}")
    def copy_document(self, document):
        print(f"Copying {document}")
"""
A client needing only printing should not depend on the "Scanner" or "Copier" methods.
### 3.5 Dependency Inversion Principle (DIP)
High-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details. Details should depend on abstractions.
**Do This:**
* Use dependency injection to provide dependencies to components.
* Program to interfaces rather than concrete implementations.
**Don't Do This:**
* Hardcode dependencies within components.
**Why:** Increases code flexibility and testability.
**Example (Python DIP):**
"""python
# Good: Using dependency injection
class Switchable:
    def turn_on(self):
        raise NotImplementedError
    def turn_off(self):
        raise NotImplementedError
class LightBulb(Switchable):
    def turn_on(self):
        print("LightBulb: turned on...")
    def turn_off(self):
        print("LightBulb: turned off...")
class ElectricPowerSwitch:
    def __init__(self, client: Switchable):
        self.client = client
        self.on = False
    def press(self):
        if self.on:
            self.client.turn_off()
            self.on = False
        else:
            self.client.turn_on()
            self.on = True
# Bad: Hardcoded dependency
class SwitchBad:
    def __init__(self):
        self.bulb = LightBulb()  # Concrete dependency = Bad
        self.on = False
    def press(self):
        if self.on:
            self.bulb.turn_off()
            self.on = False
        else:
            self.bulb.turn_on()
            self.on = True
bulb = LightBulb()
switch = ElectricPowerSwitch(bulb)  # Dependency Injection
switch.press()
switch.press()
"""
## 4. Fly.io Specific Considerations
### 4.1 Using Fly.io Volumes
Components that require persistent storage should leverage Fly.io Volumes.
**Do This:**
* Mount volumes to specific directories in your Fly.io instances.
* Use volumes to store data that needs to persist across deployments.
**Don't Do This:**
* Store persistent data within the container's filesystem, risking data loss on restarts.
**Why:** Volumes provide reliable and persistent storage for your applications.
**Example (Fly.io Volume Configuration in "fly.toml"):**
"""toml
app = "my-data-app"
primary_region = "ord"
[build]
[deploy]
release_command = "/app/migrate_db.sh"
[[mounts]]
source = "data_volume" # Existing volume name
destination = "/data" # Where the volume is mounted
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
"""
### 4.2 Fly.io Secrets Management
Securely manage sensitive information using Fly.io Secrets.
**Do This:**
* Store API keys, database passwords, and other sensitive data as Fly.io Secrets.
* Access secrets in your code using environment variables.
**Don't Do This:**
* Hardcode secrets in your code or configuration files.
* Commit secrets to your version control system.
**Why:** Protects sensitive data and prevents unauthorized access.
**Example (Accessing Fly.io Secret in Python):**
"""python
import os
database_password = os.environ.get("DATABASE_PASSWORD")
# Use the password to connect to the database; never print or log the secret itself
print("Connecting to database...")
"""
### 4.3 Fly.io Edge Network and Global Distribution
Leverage Fly.io's edge network for improved performance.
**Do This:**
* Configure your services to take full advantage of the Fly.io global network.
* Pin components to a region when consistency matters more than latency.
**Don't Do This:**
* Ignore latency implications of not using Fly.io's global network effectively.
**Why:** Reduced latency provides a better user experience.
## 5. Component Communication
### 5.1 REST APIs
Use REST APIs for synchronous communication between components.
**Do This:**
* Design REST APIs using standard HTTP methods and status codes.
* Use a consistent API versioning strategy.
* Implement proper authentication and authorization for API endpoints.
**Don't Do This:**
* Expose internal implementation details through the API.
* Create overly complex or inconsistent APIs.
**Why:** REST APIs are well-established and easy to understand, enabling interoperability.
### 5.2 Message Queues (e.g. Redis, NATS)
Use message queues for asynchronous communication between components.
**Do This:**
* Choose a message queue that fits your application's needs (e.g., Redis, RabbitMQ, NATS).
* Design message formats that are easy to serialize and deserialize.
* Implement error handling and retry mechanisms for message processing.
**Don't Do This:**
* Use message queues for synchronous operations that require immediate responses.
* Create overly complex messaging topologies.
**Why:** Message queues enable decoupling, asynchronous processing, and improved scalability. Fly.io makes it easy to deploy Redis and NATS in a colocated fashion.
### 5.3 gRPC
Consider gRPC for high-performance communication between internal components.
**Do This:**
* Define gRPC services using Protocol Buffers.
* Generate code for both client and server using gRPC tools.
* Implement proper error handling and logging.
**Don't Do This:**
* Use gRPC for external APIs that need to be easily accessible to a wide range of clients.
* Overcomplicate gRPC service definitions.
**Why:** gRPC provides high performance, efficient serialization, and strong typing. It typically requires more sophistication than REST.
## 6. Testing
### 6.1 Unit Testing
Write unit tests for all components to verify their functionality in isolation.
**Do This:**
* Use a testing framework appropriate for your language (e.g., pytest for Python, JUnit for Java).
* Write tests that cover all possible code paths and edge cases.
* Use mocks and stubs to isolate components from their dependencies.
**Don't Do This:**
* Skip unit testing or write tests that are too superficial.
* Write tests that are tightly coupled to the implementation details of the tested components.
**Why:** Unit tests ensure that components function correctly and prevent regressions.
### 6.2 Integration Testing
Write integration tests to verify the interaction between different components.
**Do This:**
* Test the communication between components using real or simulated dependencies.
* Verify that data is correctly passed between components and that the overall system behaves as expected.
**Don't Do This:**
* Skip integration testing or write tests that are too narrow in scope.
* Rely solely on unit tests without verifying how components work together.
**Why:** Integration tests ensure that components work together correctly.
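The guidance on mocks, stubs, and simulated dependencies can be sketched with the standard library alone. This is a minimal illustration, not a prescribed layout: "OrderService", its "place_order" method, and the "charge" call on the injected client are hypothetical names invented for the example.

```python
from unittest import mock


class OrderService:
    # Hypothetical component that depends on an injected payment client.
    def __init__(self, payment_client):
        self.payment_client = payment_client

    def place_order(self, order_id, amount_cents):
        # Delegate to the dependency and translate its answer.
        result = self.payment_client.charge(order_id=order_id, amount_cents=amount_cents)
        return "confirmed" if result["status"] == "ok" else "rejected"


def test_place_order_confirms_when_charge_succeeds():
    client = mock.Mock()  # Stub standing in for the real payment client
    client.charge.return_value = {"status": "ok"}
    service = OrderService(client)
    assert service.place_order("o-1", 500) == "confirmed"
    # Verify the data passed between the two components.
    client.charge.assert_called_once_with(order_id="o-1", amount_cents=500)


def test_place_order_rejects_when_charge_fails():
    client = mock.Mock()
    client.charge.return_value = {"status": "declined"}
    assert OrderService(client).place_order("o-2", 500) == "rejected"
```

Functions named "test_*" are picked up by pytest as-is. Swapping the "mock.Mock()" for a real client pointed at a staging service turns the same assertions into an integration test.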
### 6.3 End-to-End Testing
Write end-to-end tests to verify the entire application flow from the user interface to the backend.
**Do This:**
* Use a testing framework that simulates user interactions (e.g., Selenium, Cypress).
* Test the entire application flow from the user interface to the backend.
* Verify that the application meets the user's requirements.
**Don't Do This:**
* Skip end-to-end testing or write tests that are too complex and brittle.
* Rely solely on unit and integration tests without verifying the end-to-end user experience.
**Why:** End-to-end tests ensure that the application meets the user's requirements and provides a good user experience.
## 7. Monitoring and Logging
### 7.1 Centralized Logging
Use a centralized logging system to collect and analyze logs from all components.
**Do This:**
* Use a logging framework appropriate for your language (e.g., log4j for Java, logging for Python).
* Configure components to log all important events, including errors, warnings, and informational messages.
* Use a tool such as Grafana Loki or similar system for log aggregation.
**Don't Do This:**
* Skip logging or rely solely on local log files.
* Log sensitive data such as passwords or API keys.
**Why:** Centralized logging enables easier troubleshooting, performance monitoring, and security analysis.
### 7.2 Metrics Collection
Collect metrics from all components to monitor their performance and resource usage.
**Do This:**
* Use a metrics library appropriate for your language (e.g., Prometheus client libraries).
* Collect metrics such as CPU usage, memory usage, network traffic, and request latency.
* Use a monitoring system such as Prometheus or Grafana to visualize and analyze metrics.
**Don't Do This:**
* Skip metrics collection or collect only a limited set of metrics.
* Use metrics that are not meaningful or actionable.
**Why:** Metrics provide valuable insights into the health and performance of your components.
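To make the idea of collecting request counts, error counts, and latency concrete, here is a standard-library sketch. It is a stand-in for a real metrics client such as a Prometheus library, not a replacement for one; the "Metrics" class and its "timed" and "p95" helpers are hypothetical names used only for illustration.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    # Hypothetical in-process registry; a real deployment would export these values.
    def __init__(self):
        self.counters = defaultdict(int)
        self.latencies = defaultdict(list)

    @contextmanager
    def timed(self, name):
        # Record one request, its duration, and whether it raised.
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.counters[f"{name}.errors"] += 1
            raise
        finally:
            self.counters[f"{name}.requests"] += 1
            self.latencies[name].append(time.perf_counter() - start)

    def p95(self, name):
        # Nearest-rank 95th percentile of the recorded durations, in seconds.
        samples = sorted(self.latencies[name])
        if not samples:
            return None
        return samples[math.ceil(0.95 * len(samples)) - 1]
```

Wrapping a handler body in "with metrics.timed("db_query"):" yields a request count, an error count, and a latency distribution per operation, which are the kinds of actionable metrics the list above asks for.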
### 7.3 Tracing
Implement distributed tracing to track requests as they flow through different components.
**Do This:**
* Instrument code with a tracing library such as OpenTelemetry and export traces to a backend such as Jaeger or Zipkin.
* Instrument code to generate spans for each request as it enters and exits a component.
* Use a tracing backend to collect and visualize traces.
**Don't Do This:**
* Skip tracing or trace only a limited set of requests.
* Create traces that are too granular or lack context.
**Why:** Tracing enables you to identify performance bottlenecks and diagnose issues in distributed systems. Standard tracing setups run on Fly.io like any other workload.
# API Integration Standards for Fly.io
This document outlines coding standards for API integration within the Fly.io ecosystem. It provides guidelines for connecting to backend services and external APIs, emphasizing maintainability, performance, and security. It is intended to be used as a central reference for developers and as context for AI coding assistants. It reflects best practices as of late 2024 and will be updated regularly as the Fly.io platform evolves.
## 1. General Principles
### 1.1. Idempotency
* **Do This:** Ensure that API calls are idempotent where applicable. This is especially important for operations that modify data, such as creating or updating records. Use UUIDs or similar unique identifiers for requests to enable retries without unintended side effects.
* **Don't Do This:** Assume that every API call succeeds on the first attempt. Network issues or server errors can lead to failed requests, and retries may be necessary.
**Why this matters:** Idempotency ensures that repeated API requests have the same effect as a single request. This enhances system reliability, especially in distributed environments like Fly.io, where network hiccups are possible.
**Code Example (Go):**
"""go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"

	"github.com/google/uuid"
)

func createResource(url string, data []byte, idempotencyKey string) error {
	client := &http.Client{}
	req, err := http.NewRequest("POST", url, bytes.NewBuffer(data))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Idempotency-Key", idempotencyKey) // Add Idempotency Key
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		fmt.Println("Resource created successfully.")
		return nil
	}
	return fmt.Errorf("failed to create resource, status code: %d", resp.StatusCode)
}

func main() {
	// Raw string literal so the JSON quotes need no escaping
	resourceData := []byte(`{"name": "Example Resource"}`)
	resourceURL := "https://api.example.com/resources" // Replace with actual API endpoint
	// Generate the key once and reuse it on every retry of this request
	idempotencyKey := uuid.New().String()
	err := createResource(resourceURL, resourceData, idempotencyKey)
	if err != nil {
		log.Fatalf("Failed to create resource: %v", err)
	}
}
"""
### 1.2. Error Handling
* **Do This:** Implement robust error handling for API calls. Log errors with sufficient context for debugging, including request details, response codes, and error messages. Use structured logging formats like JSON for easier analysis. Implement exponential backoff for retries in case of transient errors.
* **Don't Do This:** Ignore errors or simply print error messages to the console. This makes it difficult to diagnose and resolve issues.
**Why this matters:** Proper error handling ensures that your application can gracefully recover from failures and provides valuable insights into system behavior.
**Code Example (Python):**
"""python
import logging
import os
import time

import requests

# Configure logging (adjust level as needed)
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')


def call_api_with_retry(url, data, max_retries=3, backoff_factor=2):
    # Calls an API endpoint with exponential backoff retry logic.
    for attempt in range(max_retries):
        try:
            response = requests.post(url, json=data, timeout=10)
            response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
            return response.json()
        except requests.exceptions.RequestException as e:
            logging.error(f"API request failed (attempt {attempt + 1}/{max_retries}): {e}")
            if attempt < max_retries - 1:
                wait_time = backoff_factor ** attempt
                logging.info(f"Retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                logging.error("Max retries reached. API call failed.")
                raise  # Re-raise the exception after the last retry


def main():
    api_url = os.environ.get("MY_API_URL", "https://example.com/api")
    payload = {"message": "Hello from Fly.io!"}
    try:
        result = call_api_with_retry(api_url, payload)
        logging.info(f"API response: {result}")
    except Exception:
        # logging.exception records the traceback along with the message
        logging.exception("An unrecoverable error occurred during API call.")


if __name__ == "__main__":
    main()
"""
* Note the use of "os.environ.get" to retrieve the API URL. This keeps configuration dynamic on Fly.io and avoids hardcoding. Using "logging.exception" in the "except" block captures the full stack trace, which simplifies debugging.
* This example retries every failed request, including 4xx responses. In practice, retry only transient failures such as timeouts, connection errors, and 5xx responses.
### 1.3. Security
* **Do This:** Securely store and manage API keys and other sensitive credentials using Fly.io secrets. Avoid hardcoding credentials directly in your application code or configuration files. Implement proper authentication and authorization mechanisms to protect your APIs from unauthorized access. Enforce TLS/SSL encryption for all API communication.
* **Don't Do This:** Commit credentials to your Git repository or expose them in client-side code.
**Why this matters:** Protecting sensitive credentials and enforcing access control are essential for preventing security breaches and protecting user data.
**Code Example (Node.js):**
"""javascript
// Requires the 'node-fetch' package (npm install node-fetch)
import fetch from 'node-fetch';

async function callApi() {
  const apiKey = process.env.MY_API_KEY; // Retrieve API key from Fly.io secrets
  if (!apiKey) {
    console.error("API key not found in environment variables.");
    return;
  }
  try {
    const response = await fetch('https://api.example.com/data', { // Replace with your API endpoint
      headers: {
        // Template literals need backticks for ${...} to be interpolated
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      method: 'GET' // Or POST, PUT, DELETE as needed
    });
    if (!response.ok) {
      console.error(`API request failed with status: ${response.status}`);
      const errorData = await response.json(); // Attempt to parse the error response
      console.error("Error data:", errorData);
      return;
    }
    const data = await response.json();
    console.log('API Response:', data);
  } catch (error) {
    console.error('Error calling API:', error);
  }
}

callApi();
"""
* In this example, the API key is retrieved from the "process.env" object, corresponding to a Fly.io secret. Never hardcode the API key in the source code. The example also parses the error response and prints it when the API request fails, which makes debugging easier.
* Remember to set the secret using "flyctl secrets set MY_API_KEY=<your_api_key>".
### 1.4. Rate Limiting
* **Do This:** Implement rate limiting to protect your APIs from abuse and prevent resource exhaustion. Use a sliding window or token bucket algorithm to enforce rate limits. Provide informative error messages to clients when they exceed the rate limit.
* **Don't Do This:** Allow unlimited API requests, as this can lead to denial-of-service attacks or unexpected costs.
**Why this matters:** Rate limiting protects your APIs from abuse, ensures fair resource allocation, and prevents your application from being overwhelmed by excessive traffic.
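As a minimal sketch of the token bucket algorithm mentioned above, the class below refills tokens at a fixed rate up to a burst capacity and admits a request only when a token is available. "TokenBucket", its parameters, and the injectable "clock" are hypothetical names chosen for illustration; the sketch is single-process and not thread-safe.

```python
import time


class TokenBucket:
    # Hypothetical limiter: `rate_per_sec` tokens are added per second, up to `capacity`.
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock  # Injectable so tests can control time
        self.tokens = float(capacity)  # Start full to allow an initial burst
        self.updated = clock()

    def allow(self, cost=1.0):
        # Refill for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # Caller should respond with HTTP 429 and a clear message
```

In practice you would keep one bucket per client, for example in a dictionary keyed by API key or IP address. With several Machines, a shared store such as Redis is needed if the limit must hold across instances.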
**Implementation:** Rate limiting can be implemented at various levels, including:
* **Application Level:** Using middleware or custom code to track and limit requests based on IP address, user ID, or API key.
* **Fly Proxy:** Fly.io's proxy enforces per-Machine concurrency limits ("soft_limit" and "hard_limit" in "fly.toml"). These shed load but are not per-client rate limits, so treat them as a backstop rather than a substitute.
* **External API Gateway:** Use a dedicated API gateway service (e.g., Kong, Tyk) for advanced rate limiting and other API management features.
Consider using a library like "go-rate" (for Go) or "Flask-Limiter" (for Python) to simplify rate limiting implementation.
### 1.5. Data Serialization
* **Do This:** Use a consistent data serialization format (e.g., JSON, Protocol Buffers) for API communication. Define clear schemas for request and response payloads to ensure data integrity and facilitate validation.
* **Don't Do This:** Use inconsistent or poorly defined data formats, as this can lead to parsing errors and interoperability issues.
**Why this matters:** Consistent data serialization ensures that data can be easily exchanged between different systems and programming languages. Clear schemas improve data validation and reduce the risk of errors.
**Example (JSON schema):**
"""json
{
  "type": "object",
  "properties": {
    "userId": {
      "type": "integer",
      "description": "Unique identifier for the user"
    },
    "username": {
      "type": "string",
      "minLength": 3,
      "maxLength": 50,
      "description": "User's username"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "User's email address"
    }
  },
  "required": ["userId", "username", "email"]
}
"""
Use libraries like "jsonschema" (Python) or "ajv" (JavaScript) to validate data against a JSON schema.
## 2. Fly.io Specific Considerations
### 2.1. Fly.io Secrets
* **Do This:** Use Fly.io secrets (accessed via environment variables) for any configuration value that should not be checked into source control, especially API keys, database passwords, and other credentials.
* **Don't Do This:** Hardcode secrets into your source code or configuration files.
**Code Example (Python):**
"""python
import os

api_key = os.environ.get("MY_API_KEY")
if not api_key:
    print("API key not found in environment.")
    # Handle the missing key appropriately, e.g., exit or use a default
# Use api_key in your API calls
"""
### 2.2. Fly.io Regions
* **Do This:** Design your API integrations to be region-aware. If your backend services are deployed in multiple Fly.io regions, use the "FLY_REGION" environment variable to determine the optimal region for connecting to those services. Consider using a service discovery mechanism to dynamically locate the nearest instance of your backend services.
* **Don't Do This:** Hardcode specific region URLs or IP addresses, as this can lead to performance issues and increased latency.
**Code Example (Go):**
"""go
package main

import (
	"fmt"
	"os"
)

func main() {
	flyRegion := os.Getenv("FLY_REGION")
	if flyRegion == "" {
		fmt.Println("FLY_REGION environment variable not set.")
		return
	}
	var apiEndpoint string
	switch flyRegion {
	case "ams":
		apiEndpoint = "https://api.example.com/ams" // Amsterdam
	case "iad":
		apiEndpoint = "https://api.example.com/iad" // Washington, D.C.
	case "sjc":
		apiEndpoint = "https://api.example.com/sjc" // San Jose
	default:
		apiEndpoint = "https://api.example.com/default" // Default region
	}
	fmt.Printf("Connecting to API endpoint: %s\n", apiEndpoint)
	// Your API call logic here, using apiEndpoint.
}
"""
### 2.3. Fly.io Private Networking
* **Do This:** Utilize Fly.io's private networking features to securely communicate between your applications and backend services. Deploy your backend services within the same Fly.io organization and use internal DNS names (e.g., "<app-name>.internal") to access them.
* **Don't Do This:** Expose your backend services directly to the public internet if they are only intended for internal use.
**Example "fly.toml" configuration for internal service:**
"""toml
app = "my-backend-service"
primary_region = "iad"

# No [http_service] or [[services]] section is defined on purpose.
# Those sections publish ports through Fly.io's public proxy; leaving them
# out keeps the app reachable only over the organization's private network.
"""
Access this service from another Fly.io app named "my-frontend-app" using "http://my-backend-service.internal:8080". No need to expose public ports. The backend process must listen on the private IPv6 network for this to work, for example by binding to "[::]:8080". Traffic to ".internal" addresses bypasses the proxy, so it does not wake stopped Machines.
### 2.4. Fly.io Volumes
* **Do This:** Consider using Fly.io volumes to persist data that is generated or consumed by your API integrations, such as caches, logs, or temporary files. This ensures data durability across application restarts and deployments.
* **Don't Do This:** Store sensitive data directly on the volume without proper encryption and access controls.
**Example "fly.toml" configuration for attaching a volume:**
"""toml
app = "my-api-app"
primary_region = "iad"

[build]

[deploy]

[env]

[mounts]
source = "my_volume"
destination = "/data"
"""
Then, inside your application, you can access the volume at the "/data" path. Remember to create the volume using "flyctl volumes create my_volume --region iad --size 10" first.
### 2.5. Health Checks
* **Do This:** Implement comprehensive health checks for your API integrations. Ensure that your health checks verify not only that your application is running but also that it can successfully connect to and communicate with all required backend services. This lets Fly.io stop routing traffic to unhealthy instances.
* **Don't Do This:** Rely solely on basic "ping" health checks that only verify that the application process is running.
**Example (Fly.toml health check):**
"""toml
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
processes = ["app"]

# Checks are an array of tables, hence the double brackets
[[http_service.checks]]
method = "GET"
path = "/healthz"
interval = "15s" # Check every 15 seconds
timeout = "2s" # Timeout after 2 seconds
grace_period = "5s" # Wait 5 seconds before starting checks
"""
And corresponding Go "/healthz" handler:
"""go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func healthzHandler(w http.ResponseWriter, r *http.Request) {
	// Add logic to check connections to backend services here.
	// For example, check DB connection, external API connection, etc.
	// On failure, respond with http.StatusServiceUnavailable instead.
	// For demonstration purposes, simply return 200 OK.
	fmt.Fprint(w, "OK")
}

func main() {
	http.HandleFunc("/healthz", healthzHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
"""
## 3. Patterns for Connecting to Backend Services and External APIs
### 3.1. API Gateway Pattern
* **Do This:** Use an API gateway to centralize API management, routing, authentication, authorization, and other cross-cutting concerns. This improves security, simplifies application development, and enables features like rate limiting and request transformation.
* **Don't Do This:** Expose your backend services directly to the public internet without an API gateway.
**Implementation:** You can use a dedicated API gateway service (e.g., Kong, Tyk, Ambassador), or you can build a lightweight API gateway using a reverse proxy like Nginx or Traefik. Any of these can run as a Fly.io app in front of your internal services.
### 3.2. Backend for Frontend (BFF) Pattern
* **Do This:** Create separate backend services tailored to the specific needs of different frontends (e.g., web, mobile). This allows you to optimize data retrieval, transformation, and presentation for each frontend, improving performance and user experience.
* **Don't Do This:** Use a single monolithic backend service that serves all frontends, as this can lead to unnecessary complexity and performance bottlenecks.
**Implementation:** Deploy separate backend services for each frontend type, each with its own API endpoints and data models.
### 3.3. Circuit Breaker Pattern
* **Do This:** Implement the circuit breaker pattern to prevent cascading failures when communicating with external APIs. This involves monitoring the success rate of API calls and automatically opening the circuit breaker if the failure rate exceeds a certain threshold. When the circuit breaker is open, subsequent API calls fail immediately without attempting to connect to the external API. After a timeout, the circuit breaker becomes "half-open" and lets a limited number of calls through to see if the external API has recovered.
* **Don't Do This:** Allow your application to continuously attempt to connect to a failing external API, as this can lead to resource exhaustion and application instability.
**Why this matters:** The Circuit Breaker pattern is crucial for building resilient applications that can gracefully handle failures in external services.
**Code Example (Go using "github.com/sony/gobreaker"):**
"""go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"time"

	"github.com/sony/gobreaker"
)

var cb *gobreaker.CircuitBreaker

func init() {
	settings := gobreaker.Settings{
		Name:        "my-api",
		MaxRequests: 5,                // Requests allowed through while half-open
		Interval:    10 * time.Second, // How often counts are cleared while closed
		Timeout:     3 * time.Second,  // How long the breaker stays open before going half-open
		ReadyToTrip: func(counts gobreaker.Counts) bool {
			// Trip after at least 10 requests with a failure rate of 60% or more.
			failureRatio := float64(counts.TotalFailures) / float64(counts.Requests)
			return counts.Requests >= 10 && failureRatio >= 0.6
		},
		OnStateChange: func(name string, from gobreaker.State, to gobreaker.State) {
			log.Printf("Circuit Breaker %s changed from %s to %s\n", name, from, to)
		},
	}
	cb = gobreaker.NewCircuitBreaker(settings)
}

func callExternalAPI(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return "API call successful!", nil
	}
	// Report server errors separately from other failures
	if resp.StatusCode == 500 {
		return "", errors.New("received 500 error")
	}
	return "", fmt.Errorf("API call failed with status: %d", resp.StatusCode)
}

func handleAPIRequest() (string, error) {
	result, err := cb.Execute(func() (interface{}, error) {
		return callExternalAPI("https://example.com/api") // Replace with your actual API endpoint
	})
	if err != nil {
		return "", fmt.Errorf("Circuit Breaker Error: %v", err)
	}
	return result.(string), nil // Type assertion after cb.Execute
}

func main() {
	for i := 0; i < 20; i++ {
		result, err := handleAPIRequest()
		if err != nil {
			fmt.Printf("Request %d failed: %v\n", i, err)
		} else {
			fmt.Printf("Request %d successful: %s\n", i, result)
		}
		time.Sleep(200 * time.Millisecond)
	}
	time.Sleep(10 * time.Second) // Give circuit breaker time to half-open
	fmt.Println("Testing after some time...")
	result, err := handleAPIRequest()
	if err != nil {
		fmt.Printf("Request after waiting failed: %v\n", err)
	} else {
		fmt.Printf("Request after waiting successful: %s\n", result)
	}
}
"""
Notes on the configuration:
* **Circuit Breaker Configuration:** The "gobreaker.Settings" struct is used to configure the circuit breaker.
* **"MaxRequests"**: The maximum number of requests allowed to pass through while the circuit breaker is half-open.
* **"Interval"**: The cyclic period, while the circuit breaker is closed, after which its internal counts are cleared.
* **"Timeout"**: How long the circuit breaker stays open before it switches to half-open. It is not a timeout for the API call itself.
* **"ReadyToTrip"**: A function that determines whether to trip the circuit breaker after a failure in the closed state. This example trips only if *both* of the following are true:
* At least 10 requests have been made ("counts.Requests >= 10"). This avoids tripping prematurely due to a single initial failure.
* The failure rate is 60% or higher ("failureRatio >= 0.6").
* **"OnStateChange"**: A function that gets called anytime the circuit breaker changes state (Closed, Open, Half-Open). This is useful for logging and monitoring.
### 3.4. Asynchronous Communication (Queues)
* **Do This:** Use message queues (e.g., Redis Queue, RabbitMQ) for asynchronous communication between your applications and backend services. This decouples your applications, improves scalability, and enhances resilience.
* **Don't Do This:** Rely solely on synchronous API calls, as this can lead to blocking operations and performance bottlenecks.
**Implementation:** Use a message queue library to publish messages to a queue and consume messages from the queue in your backend services. Redis or RabbitMQ instances can run as Fly.io apps on the same private network.
## 4. API Versioning
* **Do This:** Implement API versioning to maintain backward compatibility as your APIs evolve. Use a version number in the API endpoint URL (e.g., "/api/v1/users") or in the request headers (e.g., "Accept: application/json; version=1").
* **Don't Do This:** Make breaking changes to your APIs without introducing a new version, as this can break existing clients.
**Example using URL versioning:**
"""
https://api.example.com/v1/users
https://api.example.com/v2/users
"""
**Example using header versioning:**
"""
Accept: application/json; version=1
"""
The server must inspect the "Accept" header and route the request to the appropriate version handler.
## 5. Monitoring and Logging
* **Do This:** Implement comprehensive monitoring and logging for your API integrations. Track key metrics such as request latency, error rates, and throughput. Use a centralized logging system to collect and analyze logs from all your applications.
* **Don't Do This:** Neglect monitoring and logging, as this makes it difficult to identify and resolve issues.
**Implementation:** Use a monitoring tool like Prometheus or Grafana to collect and visualize metrics. Use a logging system like ELK stack (Elasticsearch, Logstash, Kibana) or Splunk to collect and analyze logs. Fly.io exposes Prometheus-compatible metrics for apps and offers hosted Grafana dashboards. Consider setting up your own Grafana and Prometheus on Fly.io for in-depth analytics.
This coding standards document provides a guide to API integration within the Fly.io ecosystem. By following these standards, you can build robust, scalable, and secure applications on the Fly.io platform. Remember to regularly review and update these standards as the Fly.io platform evolves.
# Performance Optimization Standards for Fly.io
This document outlines the coding standards focused on performance optimization for applications deployed on Fly.io. Adhering to these standards will lead to faster, more responsive, and resource-efficient applications. These standards are tailored for the latest version of Fly.io and incorporate modern approaches for optimal performance within the Fly.io ecosystem.
## 1. Architectural Considerations for Performance
### 1.1. Region Selection and Geographic Distribution
**Standards:**
* **Do This:** Deploy your application to multiple regions closest to your users. Use Fly.io's built-in support for global deployments to minimize latency.
* **Don't Do This:** Deploy only to a single region, especially if your user base is geographically distributed.
**Why:** Reduces latency by serving users from the nearest available region. Improves availability by distributing load across multiple regions.
**Code Example (fly.toml):**
"""toml
app = "my-fly-app"
primary_region = "iad" # Initial region

[build]

[deploy]
release_command = "/app/bin/my-fly-app migrate"
strategy = "rolling"

# [http_service] already serves ports 80 and 443, so no separate
# [[services]] block is needed for the same internal port.
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
processes = ["app"]
"""
Additional regions are not declared in "fly.toml". Expand reach (for example to "iad", "lhr", and "syd") by running Machines in those regions, for instance with "fly scale count" and its "--region" option.
**Anti-Pattern:** Hardcoding region-specific logic into the application code. Use Fly.io's configuration and routing features instead.
### 1.2. Database Proximity
**Standards:**
* **Do This:** Locate your database (e.g., Postgres, Redis) in the same region as your application servers whenever possible to minimize network latency. Consider using Fly.io's managed Postgres or Redis services.
* **Don't Do This:** Access a database across regions unless absolutely necessary.
**Why:** Reduces latency for database queries, improving overall application responsiveness.
**Code Example (Connecting to Fly.io Postgres):**
"""python
import os

import psycopg2

# Fetch database credentials from environment variables.
# The variable names are illustrative; set them with "flyctl secrets set".
db_host = os.environ.get("FLY_POSTGRES_FQDN")
db_name = os.environ.get("PGDATABASE")
db_user = os.environ.get("PGUSER")
db_password = os.environ.get("PGPASSWORD")  # Read from a secret, never hardcode

try:
    conn = psycopg2.connect(
        host=db_host,
        database=db_name,
        user=db_user,
        password=db_password,
        port=5432,  # Usually 5432 for PostgreSQL
    )
    print("Database connection successful")
    cur = conn.cursor()
    cur.execute("SELECT version();")
    db_version = cur.fetchone()
    print(f"PostgreSQL version: {db_version}")
    cur.close()
    conn.close()
except psycopg2.Error as e:
    print(f"Error connecting to database: {e}")
"""
**Anti-Pattern:** Ignoring database latency. Profile database queries to identify and optimize slow operations.
### 1.3. Caching Strategies
**Standards:**
* **Do This:** Implement caching at multiple levels: browser, edge (a CDN or a cache running close to users on Fly.io's global network), application server (in-memory), and database (query caching). Use appropriate cache invalidation strategies. Implement HTTP caching headers (e.g., "Cache-Control", "Expires").
* **Don't Do This:** Rely solely on database caching. Cache frequently accessed data closer to the user.
**Why:** Reduces load on application servers and databases, resulting in faster response times and lower resource utilization.
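For the application-server (in-memory) layer and the invalidation strategies mentioned above, a minimal sketch is a time-to-live cache with explicit invalidation. "TTLCache", "get_or_load", and "invalidate" are hypothetical names used for illustration; the cache lives in one process, so each Machine holds its own copy.

```python
import time


class TTLCache:
    # Hypothetical in-memory cache: entries expire `ttl_seconds` after loading.
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # Injectable so tests can control time
        self._entries = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        # Return a fresh cached value, or call `loader` and cache its result.
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value = loader()
        self._entries[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key):
        # Drop an entry right after the underlying data changes.
        self._entries.pop(key, None)
```

A typical call is "cache.get_or_load("user:42", lambda: fetch_user(42))", where "fetch_user" stands for whatever slow query the application runs. Call "invalidate" from the code path that updates the record so readers do not see stale data for the full TTL.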
**Code Example (HTTP Caching with Flask):**
"""python
from flask import Flask, make_response

app = Flask(__name__)


@app.route('/')
def index():
    response = make_response("<h1>Hello, World!</h1>")
    response.headers['Cache-Control'] = 'public, max-age=3600'  # Cache for 1 hour
    return response


if __name__ == '__main__':
    # Development server only; do not enable debug mode in production
    app.run(host='0.0.0.0', port=8080)
"""
**Anti-Pattern:** Aggressively caching dynamic content. Use appropriate cache invalidation techniques when data changes.
### 1.4. Connection Pooling
**Standards:**
* **Do This:** Use connection pooling for database connections to reduce the overhead of establishing new connections for each request.
* **Don't Do This:** Create a new database connection for every request, especially under high load.
**Why:** Reduces database load and improves application response time by reusing existing connections.
**Code Example (Connection Pooling with SQLAlchemy):**
"""python
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# The variable names are illustrative; set them with "flyctl secrets set".
db_host = os.environ.get("FLY_POSTGRES_FQDN")
db_name = os.environ.get("PGDATABASE")
db_user = os.environ.get("PGUSER")
db_password = os.environ.get("PGPASSWORD")  # Read from a secret, never hardcode

# Database URL built from the values above
db_url = f"postgresql://{db_user}:{db_password}@{db_host}/{db_name}"

# Create a database engine with connection pooling
engine = create_engine(db_url, pool_size=5, max_overflow=10)  # Adjust pool_size and max_overflow

# Create a session factory
Session = sessionmaker(bind=engine)


# Example Usage:
def get_data_from_db():
    session = Session()
    try:
        # Perform database operations using the session
        # Example:
        # results = session.query(MyTable).all()
        print("Querying the DB... Replace with your actual query here")
    except Exception as e:
        print(f"Error during database operation: {e}")
    finally:
        session.close()  # Always close the session so the connection returns to the pool


if __name__ == '__main__':
    get_data_from_db()
"""
**Anti-Pattern:** Setting the connection pool size too small or too large. Tune based on application load and database capacity.
## 2. Code-Level Optimizations
### 2.1. Efficient Data Structures and Algorithms
**Standards:**
* **Do This:** Choose appropriate data structures (e.g., dictionaries, sets) and algorithms (e.g., sorting algorithms, search algorithms) for the specific task. Optimize for time and space complexity appropriately.
* **Don't Do This:** Use inefficient data structures or algorithms that lead to slow execution or high memory consumption.
**Why:** Improves application performance by minimizing resource usage and execution time.
**Code Example (Using Sets for Efficient Membership Testing):**
"""python
my_list = [1, 2, 3, 4, 5]  # Original data
my_set = set(my_list)  # Convert to a set

# Membership checks are O(1) on average for sets, versus O(n) for lists
if 3 in my_set:
    print("3 exists in my_set")
if 6 in my_set:
    print("6 exists in my_set")
else:
    print("6 does not exist in my_set")
"""
**Anti-Pattern:** Linear search on large, unsorted lists. Consider using binary search or hash tables.
### 2.2. Asynchronous Operations
**Standards:**
* **Do This:** Use asynchronous operations (e.g., async/await in Python, Promises in JavaScript) for I/O-bound tasks such as network requests, file I/O, and database queries to avoid blocking the main thread.
* **Don't Do This:** Perform blocking I/O operations on the main thread.
**Why:** Prevents blocking the event loop, allowing the application to handle more requests concurrently. Improves responsiveness and throughput.
**Code Example (Asynchronous HTTP Request with Python aiohttp):** """python import asyncio import aiohttp async def fetch_data(url): async with aiohttp.ClientSession() as session: async with session.get(url) as response: return await response.text() async def main(): data = await fetch_data('https://example.com') print(data[:100]) # Print the first 100 characters if __name__ == '__main__': asyncio.run(main()) """ **Anti-Pattern:** Mixing synchronous and asynchronous code without proper thread management. Use appropriate executors or thread pools. ### 2.3. Resource Management **Standards:** * **Do This:** Explicitly release resources such as file handles, database connections, and memory as soon as they are no longer needed. Use "try...finally" blocks or context managers ("with" statement in Python) to ensure proper resource cleanup. Utilize Fly.io's autoscaling to efficiently use resources. Consider autoscaling to zero during off-peak hours. * **Don't Do This:** Leak resources, which can lead to memory exhaustion or other performance problems. **Why:** Prevents resource leaks, ensuring efficient utilization of system resources. Improves application stability and scalability. **Code Example (Using "with" Statement for File Handling):** """python try: with open('my_file.txt', 'r') as f: data = f.read() print(data) except FileNotFoundError:
# Code Style and Conventions Standards for Fly.io This document outlines the coding style and conventions to be followed when developing applications for the Fly.io platform. Adhering to these standards ensures code maintainability, readability, and consistency, leading to improved collaboration and reduced debugging efforts. These guidelines also optimize application performance and security within the Fly.io environment. ## 1. General Principles ### 1.1 Consistency **Do This:** Maintain a consistent coding style across the entire project. Use a linter and formatter to enforce these rules automatically. **Don't Do This:** Mix different coding styles within the same file or project. **Why:** Consistency improves readability and reduces cognitive load when developers work on different parts of the application. It also helps AI coding assistants to generate code that fits seamlessly with existing code. ### 1.2 Readability **Do This:** Write code that is easy to understand and follow. Use meaningful variable and function names, and add comments where necessary. **Don't Do This:** Write overly complex or cryptic code that is difficult to decipher. **Why:** Readability is crucial for maintainability. Code should be self-documenting where possible. ### 1.3 Brevity **Do This:** Keep code concise and avoid unnecessary complexity. Use language features to express ideas clearly and efficiently. **Don't Do This:** Write verbose or repetitive code that can be simplified. **Why:** Brevity reduces the size of the codebase, makes it easier to understand, and can improve performance by reducing the amount of code that needs to be executed. ### 1.4 Testability **Do This:** Write code that is easy to test. Use dependency injection and other techniques to decouple components. **Don't Do This:** Write tightly coupled code that is difficult to isolate and test. **Why:** Testability ensures that the application functions correctly and reduces the risk of introducing bugs. 
Automated tests are essential for continuous integration and deployment. ## 2. Formatting ### 2.1 Indentation and Spacing **Do This:** Use 4 spaces for indentation. Use consistent spacing around operators and after commas. **Don't Do This:** Use tabs for indentation. Omit spaces around operators or after commas. **Example (Python):** """python def calculate_total(price, quantity, tax_rate=0.07): """Calculates the total cost including tax.""" subtotal = price * quantity tax = subtotal * tax_rate total = subtotal + tax return total """ **Example (Go):** """go package main import "fmt" func calculateTotal(price float64, quantity int, taxRate float64) float64 { subtotal := price * float64(quantity) tax := subtotal * taxRate total := subtotal + tax return total } func main() { fmt.Println(calculateTotal(19.99, 2, 0.08)) } """ **Why:** Consistent indentation and spacing improve readability and make it easier to visually parse the code structure. ### 2.2 Line Length **Do This:** Limit lines to a maximum of 120 characters. Break long lines into multiple lines using appropriate line breaks. **Don't Do This:** Write very long lines that require horizontal scrolling. **Why:** Limiting line length improves readability, especially when viewing code on different screen sizes or in diff tools. ### 2.3 Blank Lines **Do This:** Use blank lines to separate logical sections of code, such as function definitions, class definitions, and blocks of code within a function. **Don't Do This:** Use an excessive number of blank lines, or omit blank lines where they are needed. **Why:** Blank lines improve readability by visually separating different parts of the code. ### 2.4 File Encoding **Do This:** Use UTF-8 encoding for all source files. **Don't Do This:** Use other encodings that may not be universally supported. **Why:** UTF-8 is the standard encoding for text files and ensures that characters are displayed correctly across different platforms. ## 3. 
Naming Conventions
### 3.1 Variables
**Do This:** Use descriptive and meaningful names for variables. Follow the casing idiomatic to the language: camelCase in JavaScript and Java (e.g., "userName", "orderTotal"), snake_case in Python (e.g., "user_name", "order_total").
**Don't Do This:** Use single-letter variable names or cryptic abbreviations.
**Example (JavaScript):**
"""javascript
const userFirstName = "John";
const orderTotalAmount = 120.50;
"""
**Why:** Meaningful variable names make the code easier to understand and reduce the need for comments.
### 3.2 Functions and Methods
**Do This:** Use descriptive verb-noun names for functions and methods. Follow the casing idiomatic to the language: camelCase in JavaScript and Java (e.g., "getUserDetails", "calculateOrderTotal"), snake_case in Python (e.g., "get_user_profile", "calculate_shipping_cost").
**Don't Do This:** Use vague or ambiguous names.
**Example (Python):**
"""python
def get_user_profile(user_id):
    """Retrieves user profile from the database."""
    # ... implementation ...
    return profile

def calculate_shipping_cost(order_subtotal, destination):
    """Calculates shipping cost to a certain destination."""
    # ... implementation ...
    return shipping_cost
"""
**Why:** Clear and descriptive function and method names make the code easier to understand and maintain.
### 3.3 Classes
**Do This:** Use PascalCase for class names (e.g., "UserProfile", "OrderManager").
**Don't Do This:** Use lowercase or underscore-separated names for classes.
**Example (Java):**
"""java
public class UserProfile {
    private String userName;
    private String emailAddress;
    // ... methods ...
}
"""
**Why:** PascalCase for class names is a common convention that improves code readability.
### 3.4 Constants
**Do This:** Use SCREAMING_SNAKE_CASE for constant names (e.g., "MAX_RETRIES", "DEFAULT_TIMEOUT").
**Don't Do This:** Use lowercase or camelCase for constants.
**Example (JavaScript):**
"""javascript
const MAX_CONNECTIONS = 100;
const API_ENDPOINT = "https://api.example.com/v1";
"""
**Why:** SCREAMING_SNAKE_CASE clearly indicates that a variable is a constant and should not be modified.
## 4.
Code Comments ### 4.1 Documentation Comments **Do This:** Use documentation comments to describe the purpose, parameters, and return values of functions, methods, and classes. Use a standard documentation format, such as JSDoc for JavaScript or docstrings for Python. **Don't Do This:** Omit documentation comments for important code elements. **Example (JavaScript with JSDoc):** """javascript /** * Retrieves user details from the database. * @param {string} userId - The ID of the user to retrieve. * @returns {Promise<object>} A promise that resolves to the user details. */ async function getUserDetails(userId) { // ... implementation ... return user; } """ **Example (Python with docstrings):** """python def process_order(order_id): """ Processes an order by updating the order status and sending a confirmation email. :param order_id: The ID of the order to process. :type order_id: str :raises OrderProcessingError: If an error occurs during order processing. :returns: None """ # ... implementation ... pass """ **Why:** Documentation comments provide valuable information for other developers and can be used to generate API documentation automatically. ### 4.2 Inline Comments **Do This:** Use inline comments to explain complex or non-obvious code. **Don't Do This:** Over-comment code that is already clear. Avoid stating the obvious. **Example (Go):** """go // Calculate the discount amount based on the order total. discount := orderTotal * discountRate """ **Why:** Inline comments can help to clarify the intent of the code and make it easier to understand. ## 5. Error Handling ### 5.1 Explicit Error Handling **Do This:** Handle errors explicitly using try-catch blocks or error return values. Log errors with sufficient context for debugging (using structured logging). **Don't Do This:** Ignore errors or rely on default error handling. 
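In Python, the same guidance might look like the sketch below: catch the specific failure, log it as a structured record with enough context to debug, and re-raise a domain error that keeps the original cause. "InventoryError" and "reserve_stock" are hypothetical names used only for illustration.

```python
import json
import logging

logger = logging.getLogger("orders")


class InventoryError(Exception):
    """Hypothetical domain error used for illustration."""


def reserve_stock(sku: str, quantity: int, available: int) -> int:
    """Return the remaining stock, or raise InventoryError after logging context."""
    try:
        if quantity > available:
            raise ValueError(f"requested {quantity}, only {available} available")
        return available - quantity
    except ValueError as exc:
        # Structured record: event name plus the fields needed to reproduce the failure.
        logger.error(json.dumps({
            "event": "reserve_failed",
            "sku": sku,
            "quantity": quantity,
            "error": str(exc),
        }))
        # "from exc" preserves the original exception as __cause__ for debugging.
        raise InventoryError(f"could not reserve {sku}") from exc


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    print(reserve_stock("sku-1", 2, 5))
```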
**Example (Node.js with try/catch):**
"""javascript
async function processPayment(paymentDetails) {
  try {
    const result = await paymentGateway.charge(paymentDetails);
    console.log({ message: "Payment successful", transactionId: result.transactionId });
    return result;
  } catch (error) {
    // Log the failure without the raw payment details, which may contain card data
    console.error({ message: "Payment failed", error: error.message });
    throw new Error("Payment processing error");
  }
}
"""
**Example (Go with error return values):**
"""go
package main

import (
	"fmt"
	"log"
	"os"
)

func readFile(filename string) ([]byte, error) {
	data, err := os.ReadFile(filename)
	if err != nil {
		log.Printf("Error reading file %s: %v", filename, err)
		return nil, fmt.Errorf("failed to read file: %w", err)
	}
	return data, nil
}

func main() {
	content, err := readFile("myFile.txt")
	if err != nil {
		// log.Fatalf exits the program, so no return is needed
		log.Fatalf("Could not read file: %v", err)
	}
	fmt.Println(string(content))
}
"""
**Why:** Explicit error handling ensures that errors are detected and handled gracefully, preventing application crashes and data loss. Logging provides valuable information for debugging and troubleshooting.
### 5.2 Custom Exceptions
**Do This:** Create custom exceptions to represent specific error conditions in your application.
**Don't Do This:** Use generic exceptions for all error conditions.
**Example (Python):**
"""python
class OrderProcessingError(Exception):
    """Custom exception for order processing errors."""
    pass

def process_order(order_id):
    try:
        # ... order processing logic ...
        something_went_wrong = False  # Placeholder for a real failure check
        if something_went_wrong:
            raise OrderProcessingError("Failed to process order")
    except OrderProcessingError as e:
        print(f"Error processing order: {e}")
"""
**Why:** Custom exceptions make the code more readable and allow for more specific error handling.
## 6. Security
### 6.1 Input Validation
**Do This:** Validate all user inputs to prevent injection attacks. Use appropriate validation techniques for different types of inputs, such as regular expressions for strings and type checking for numbers.
**Don't Do This:** Trust user inputs without validation.
**Example (Node.js with input validation):**
"""javascript
const validator = require('validator');

function createUser(userInput) {
  if (!validator.isEmail(userInput.email)) {
    throw new Error("Invalid email address");
  }
  if (validator.isEmpty(userInput.password)) {
    throw new Error("Password cannot be empty");
  }
  // ... create user logic ...
}
"""
**Why:** Input validation is crucial for preventing security vulnerabilities such as SQL injection, cross-site scripting (XSS), and command injection.
### 6.2 Authentication and Authorization
**Do This:** Implement secure authentication and authorization mechanisms to protect sensitive data and resources. Use established protocols such as OAuth 2.0 or OpenID Connect, and standard token formats such as JWT.
**Don't Do This:** Roll your own authentication system.
**Why:** Secure authentication and authorization are essential for protecting user data and preventing unauthorized access to sensitive resources. Always use industry-standard practices.
### 6.3 Secrets Management
**Do This:** Store sensitive information, such as API keys and database passwords, securely using environment variables or a secrets management system. Fly.io provides built-in secret management.
**Don't Do This:** Hardcode secrets in your code, store them in version control, or print them to logs.
**Example (Fly.io secrets):** Use the "flyctl secrets set" command to set secrets:
"""bash
flyctl secrets set API_KEY=your_api_key DATABASE_URL=your_database_url
"""
Access secrets in your application:
**Example (Go):**
"""go
package main

import (
	"log"
	"os"
)

func main() {
	apiKey := os.Getenv("API_KEY")
	databaseURL := os.Getenv("DATABASE_URL")
	if apiKey == "" || databaseURL == "" {
		log.Fatal("required secrets API_KEY and DATABASE_URL are not set")
	}
	// Never print or log secret values.
	// Your application logic here
}
"""
**Why:** Storing secrets securely prevents unauthorized access to sensitive information and protects against security breaches.
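To confirm at startup that secrets are present without leaking their values into logs, one approach is to report a masked form. The sketch below is a minimal illustration; "mask_secret" and "describe_config" are hypothetical helpers, and showing the last four characters is an assumption you may prefer to tighten to showing nothing at all.

```python
import os


def mask_secret(value: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret, showing at most the last `visible` characters."""
    if len(value) <= visible:
        # Short values are masked entirely so nothing useful is revealed.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def describe_config(names: list[str]) -> dict[str, str]:
    """Report which secrets are set without revealing their full values."""
    report = {}
    for name in names:
        value = os.environ.get(name)
        report[name] = mask_secret(value) if value else "<unset>"
    return report


if __name__ == "__main__":
    print(describe_config(["API_KEY", "DATABASE_URL"]))
```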
### 6.4 Dependency Management
**Do This:** Keep dependencies up to date to patch security vulnerabilities. Use a dependency management tool, such as npm, pip, or Go modules, to manage dependencies and track versions.
**Don't Do This:** Use outdated dependencies with known security vulnerabilities.
**Why:** Keeping dependencies up to date ensures that your application benefits from the latest security patches and bug fixes.
## 7. Concurrency and Parallelism (Fly.io Specific)
Fly.io's architecture allows for easy scaling and distribution of your application across multiple regions. Consequently, attention to concurrency and parallelism is especially important.
### 7.1 Region Awareness
**Do This:** Design your application to be aware of the region it's running in. Utilize the "FLY_REGION" environment variable to customize behavior based on region.
**Example (Python):**
"""python
import os

def get_database_connection_string():
    region = os.environ.get("FLY_REGION")
    if region == "ams":
        return "postgres://ams_db"  # Amsterdam DB
    elif region == "sfo":
        return "postgres://sfo_db"  # San Francisco DB
    else:
        return "postgres://default_db"
"""
**Why:** Region-aware applications can optimize for latency and data locality, improving performance and user experience.
### 7.2 Handling Global State
**Do This:** Avoid relying on global state. If global state is necessary, use a distributed caching system like Redis or Memcached, readily available on Fly.io.
**Don't Do This:** Store critical state in-memory on a single VM.
**Why:** Since Fly.io applications are often distributed across multiple VMs, relying on local in-memory state can lead to inconsistencies and data loss.
### 7.3 Database Connections
**Do This:** Use connection pooling to manage database connections efficiently. Properly configure connection limits to avoid exhausting resources.
**Don't Do This:** Create a new database connection for every request.
**Why:** Connection pooling reduces the overhead of establishing new database connections, improving performance and scalability. ### 7.4 Background Tasks and Queues **Do This:** Offload long-running or resource-intensive tasks to background queues using services like Redis Queue, Celery or similar async task management tools. This helps maintain responsiveness of your web application. **Don't Do This:** Execute long-running tasks directly within request handlers. **Why:** Tasks in request handlers can cause delays and degrade the user experience. Background queues allow you to process tasks asynchronously, without blocking the main application thread. This is particularly important in edge deployments like Fly.io. ## 8. Fly.io Platform Specific Conventions ### 8.1 "fly.toml" Configuration **Do This:** Maintain a well-structured and documented "fly.toml" file. Use comments to explain the purpose of each section and key. **Don't Do This:** Leave unnecessary or commented-out configuration options in "fly.toml". **Why:** The "fly.toml" file is the central configuration file for your Fly.io application, readability is paramount. """toml # fly.toml app configuration file app = "my-cool-app" primary_region = "sfo" # Primary region for deployment [build] # Build configuration builder = "dockerfile" dockerfile = "Dockerfile" [http_service] # HTTP service configuration internal_port = 8080 force_https = true auto_stop_machines = true auto_start_machines = true min_machines_running = 1 [http_service.concurrency] type = "requests" # Handles up to 100 requests simultaneously hard_limit = 100 soft_limit = 75 [[vm]] # VM Configuration cpu_kind = "shared" cpus = 1 memory_mb = 512 """ ### 8.2 Health Checks **Do This:** Implement robust health checks to ensure that Fly.io can properly monitor the health of your application instances using "fly.toml". Ensure health checks accurately reflect the state of the app (e.g. 
database connectivity).
**Don't Do This:** Rely on simple HTTP status code checks that might not catch underlying issues.
"""toml
[checks]
  [checks.status]
    port = 8080
    type = "http"
    path = "/healthz"  # Endpoint for health status
    timeout = "2s"
    interval = "10s"
    restart_limit = 3
  [checks.database]  # TCP reachability check; verify database connectivity inside the /healthz handler
    port = 8080
    type = "tcp"
    timeout = "2s"
    interval = "15s"
    restart_limit = 3
"""
**Why:** Robust health checks allow Fly.io to automatically restart unhealthy instances, ensuring high availability.
### 8.3 Logging
**Do This:** Use structured logging (e.g., JSON format) to make logs easier to parse and analyze. Use appropriate log levels (e.g., DEBUG, INFO, WARN, ERROR) to indicate the severity of events.
**Don't Do This:** Print unstructured log messages to standard output.
**Example (Python):**
"""python
import logging
import json

# Emit only the JSON payload so each log line parses as a single JSON object
logging.basicConfig(level=logging.INFO, format='%(message)s')

def process_data(data):
    logging.info(json.dumps({"level": "info", "event": "data_received", "data": data}))
    try:
        # ... process data ...
        logging.debug(json.dumps({"level": "debug", "event": "data_processed", "status": "success"}))
        return True
    except Exception as e:
        logging.error(json.dumps({"level": "error", "event": "data_processing_failed", "error": str(e)}))
        return False
"""
**Why:** Structured logging makes it easier to search, filter, and analyze logs, facilitating debugging and monitoring.
### 8.4 Using Fly Volumes
**Do This:** Utilize Fly Volumes for persistent storage of data that needs to survive instance restarts. Understand the performance characteristics of Fly Volumes.
**Don't Do This:** Store persistent data directly on the instance's ephemeral storage.
**Why:** Fly Volumes provide persistent local storage that survives restarts and deployments. A volume is attached to a single machine and is not replicated automatically, so data durability still depends on running more than one volume or keeping backups.
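Application code can keep the volume location configurable instead of scattering a hardcoded path. The sketch below assumes a volume mounted at "/data" and a hypothetical "DATA_DIR" environment variable that overrides it; "save_upload" is an illustrative helper, not part of any Fly.io library.

```python
import os
from pathlib import Path
from typing import Optional


def data_dir() -> Path:
    """Directory backed by the volume; DATA_DIR is a hypothetical override."""
    return Path(os.environ.get("DATA_DIR", "/data"))


def save_upload(name: str, payload: bytes, base: Optional[Path] = None) -> Path:
    """Write payload under the volume directory and return the final path."""
    base = base or data_dir()
    base.mkdir(parents=True, exist_ok=True)
    # Keep only the file name so callers cannot escape the volume directory.
    target = base / Path(name).name
    # Write to a temporary file first, then rename, so readers never see a partial file.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_bytes(payload)
    os.replace(tmp, target)
    return target


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as scratch:
        print(save_upload("report.txt", b"hello", base=Path(scratch)))
```

Passing "base" explicitly, as the usage block does with a temporary directory, keeps the function testable on a machine without a mounted volume.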
### 8.5 Fly Machines API **Do This:** If your application requires more advanced control over your Fly.io deployments, leverage the Fly Machines API for custom orchestration and management. Be mindful of the API rate limits. **Why:** The Machines API is a low-level API granting precise control over Fly.io resources, enabling automation and specialized deployment strategies. This document provides a foundation for writing clean, maintainable, and secure code for the Fly.io platform. By adhering to these guidelines, developers can build robust and scalable applications that take full advantage of Fly.io's features. This guidance will serve as valuable context for AI coding assistants, helping ensure they generate code that aligns with best practices and Fly.io's unique environment.
# Tooling and Ecosystem Standards for Fly.io This document outlines coding standards focused on tooling and the Fly.io ecosystem, promoting maintainability, performance, and security. It's designed to guide developers and inform AI coding assistants. ## 1. Development Environment Setup ### 1.1 Fly CLI Tooling **Standard:** Use the latest version of the Fly CLI tool. **Do This:** Regularly update the Fly CLI: """bash flyctl version update """ **Don't Do This:** Use outdated Fly CLI versions. **Why:** Newer versions include bug fixes, performance improvements, and access to the latest Fly.io features. **Example:** Checking the current version and updating: """bash flyctl version # Expected Output (example): flyctl v0.2.21 darwin/arm64 Commit: 3a3... BuildDate: 2023-10-27T15:00:00Z flyctl version update """ ### 1.2 Editor Configuration **Standard:** Configure your editor for Fly.io project development. Use editorconfig or similar for standardized formatting. **Do This:** * Install language-specific extensions for syntax highlighting and linting. * Use editorconfig to define consistent indentation, line endings, and character encoding. * Integrate linters and formatters (e.g., ESLint, Prettier) via editor plugins. **Don't Do This:** Rely solely on default editor settings without language-specific configuration. **Why:** Consistent formatting increases code readability and reduces merge conflicts. Linting helps catch potential errors early. **Example:** ".editorconfig" file: """ini root = true [*] charset = utf-8 end_of_line = lf insert_final_newline = true trim_trailing_whitespace = true [*.py] indent_style = space indent_size = 4 [*.js] indent_style = space indent_size = 2 """ ### 1.3 Version Control (Git) **Standard:** Version control all Fly.io project code. **Do This:** * Use meaningful commit messages following conventional commits. * Use branches for feature development and bug fixes. * Use pull requests for code review. 
* Use ".gitignore" to exclude unnecessary files from version control (e.g., ".fly", "node_modules"). **Don't Do This:** Commit directly to the main branch. Ignore code review practices. Include secrets or credentials in the repository. **Why:** Version control is crucial for collaboration, code tracking, and rollbacks. **Example:** ".gitignore" File: """ .fly/ node_modules/ *.log .DS_Store """ ## 2. Dependency Management ### 2.1 Application Dependencies **Standard:** Use appropriate dependency management tools for your language/framework (e.g., "npm" for Node.js, "pip" for Python, "go mod" for Go). **Do This:** * Pin dependencies to specific versions or use version ranges (e.g., "~1.2.3" or "^1.2.3"). * Use lockfiles ("package-lock.json", "requirements.txt", "go.sum") to ensure reproducible builds. * Periodically update dependencies to address security vulnerabilities and bug fixes. * Use Dependabot (or similar tool) to automate dependency updates and vulnerability scanning. **Don't Do This:** Use wildcard versions ("*") or excessively broad version ranges. Neglect updating dependencies regularly. **Why:** Dependency management ensures consistent builds and mitigates the risk of vulnerabilities and breaking changes. **Example (Node.js package.json):** """json { "name": "my-fly-app", "version": "1.0.0", "dependencies": { "express": "^4.17.1", "node-fetch": "~2.6.1" }, "devDependencies": { "eslint": "^7.0.0" } } """ **Example (Python requirements.txt):** """ Flask==2.0.1 requests~=2.25.1 """ ### 2.2 Fly.io Specific Dependencies **Standard:** Use official or well-maintained community libraries where available for interacting with Fly.io services (e.g., distributed locks, leader election), rather than re-inventing the wheel. **Do This:** Explore the Fly.io documentation and community forums for relevant libraries. Evaluate libraries based on their maturity, maintainership, and compatibility with your application. 
**Don't Do This:** Use undocumented or abandoned libraries. Build core Fly.io platform integrations from scratch without researching the existing ecosystem. **Why:** Using community libraries accelerates development, leverages existing expertise, and promotes code reusability. ## 3. Logging and Monitoring ### 3.1 Centralized Logging **Standard:** Utilize a centralized logging system accessible from within Fly.io. **Do This:** * Integrate your application with a logging service (e.g., Grafana Loki, ELK stack, CloudWatch Logs, Datadog). * Structure logs in a standardized format (e.g., JSON) for easier parsing and analysis. * Include relevant context in log messages (e.g., request ID, user ID, instance ID). * Use environment variables to configure your logging backend to avoid hardcoding credentials: """python import logging import os logging_level = os.environ.get("LOGGING_LEVEL", "INFO").upper() logging.basicConfig(level=logging_level) logger = logging.getLogger(__name__) logger.info("Application started") """ * Use structured logging to facilitate querying and analysis, allowing you to aggregate logs by machine, application version, and severity: """python import logging import json logger = logging.getLogger(__name__) logger.setLevel(logging.INFO) def log_request(request): log_data = { "level": "info", "message": "Request received", "method": request.method, "path": request.path, "user_agent": request.user_agent.string } logger.info(json.dumps(log_data)) # Example usage in a Flask route from flask import Flask, request app = Flask(__name__) @app.route('/') def hello_world(): log_request(request) # Log the incoming request return 'Hello, World!' """ **Don't Do This:** Rely solely on console logging. Hardcode sensitive information in log messages. **Why:** Centralized logging makes it easier to troubleshoot issues and monitor application health across multiple Fly.io instances. 
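The context fields mentioned above can be attached once, in a formatter, rather than at every call site. The sketch below is one possible shape: "JsonFormatter" and "build_logger" are hypothetical names, and it assumes the runtime exposes "FLY_REGION" and "FLY_MACHINE_ID", falling back to "local" when they are absent.

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with deployment context."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Assumed runtime variables; "local" is used outside Fly.io.
            "region": os.environ.get("FLY_REGION", "local"),
            "machine_id": os.environ.get("FLY_MACHINE_ID", "local"),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to the standard stream handler."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # Avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger("app").info("Application started")
```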
**Example (simplified Python logging setup):**
"""python
import logging
import os

logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO"))
logger = logging.getLogger(__name__)
logger.info("Application starting...")
"""
### 3.2 Health Checks
**Standard:** Implement health checks for your application to detect and recover from failures. Configure these endpoints in your "fly.toml" file.
**Do This:**
* Provide an endpoint that reports the application's health status (e.g., "/healthz").
* Include checks for critical dependencies (e.g., database connection, Redis connectivity).
* Configure Fly.io health checks to use the health endpoint.
* Ensure the health check endpoint returns appropriate HTTP status codes (e.g., 200 OK for healthy, 503 Service Unavailable for unhealthy).
**Don't Do This:** Rely on basic ping checks. Neglect to monitor the health check status.
**Why:** Health checks allow Fly.io to automatically restart unhealthy instances, improving application availability.
**Example (basic health check endpoint in Flask):**
"""python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/healthz")
def healthz():
    # Add checks for database connection, etc. here
    return jsonify({"status": "ok"}), 200
"""
**Example ("fly.toml" configuration):**
"""toml
[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  [[http_service.checks]]
    path = "/healthz"
    interval = "15s"
    timeout = "2s"
    grace_period = "5s"
"""
### 3.3 Metrics and Monitoring
**Standard:** Collect and monitor application metrics to gain insights into performance and resource utilization. Integrate with Prometheus.
**Do This:**
* Expose application metrics using a standard format (e.g., Prometheus exposition format).
* Use a metrics collection and visualization tool (e.g., Prometheus, Grafana, Datadog).
* Monitor key metrics such as CPU usage, memory usage, network traffic, and request latency.
* Collect custom metrics that are relevant to your specific application and business logic. **Don't Do This:** Ignore application performance metrics. Fail to set up alerts for critical conditions. **Why:** Metrics and monitoring provide valuable insights into application behavior and allow you to identify and resolve performance bottlenecks. **Example (Exposing metrics in Python using Prometheus client library):** """python from prometheus_client import start_http_server, Summary import random import time # Create a metric to track time spent and requests made. REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request') # Decorate function with metric. @REQUEST_TIME.time() def process_request(t): """A dummy function that takes some time.""" time.sleep(t) if __name__ == '__main__': # Start up the server to expose the metrics. start_http_server(8000) # Generate some requests. while True: process_request(random.random()) """ ## 4. Secret Management ### 4.1 Fly.io Secrets Store **Standard:** Store secrets securely using Fly.io's built-in secrets store. **Do This:** * Use the "flyctl secrets" command to set secrets for your application: "flyctl secrets set MY_SECRET=myvalue". * Access secrets from your application using environment variables: "os.environ.get("MY_SECRET")". * Rotate secrets regularly. * Prefer machine secrets over app secrets when the secret needs to follow the life cycle of the machine, such as unique auth tokens that are generated per machine "flyctl machine secret set MY_MACHINE_SECRET=machine_secret --machine <machine_id>" **Don't Do This:** Hardcode secrets in your code or configuration files. Store secrets in version control. **Why:** Fly.io's secrets store provides a secure and convenient way to manage sensitive data. 
**Example (setting and accessing a secret):** """bash flyctl secrets set DATABASE_URL="postgres://user:password@host:port/database" # In your application code (Python): import os database_url = os.environ.get("DATABASE_URL") """ ### 4.2 Vault Integration **Standard:** For more complex secret management requirements, consider integrating with HashiCorp Vault. **Do This:** * Deploy a Vault instance within your Fly.io organization. * Configure your application to authenticate with Vault and retrieve secrets. * Use Vault's features, like dynamic secrets and secret leasing, to minimize risk. **Don't Do This:** Introduce Vault without a clear understanding of its concepts and best practices. **Why:** Vault provides advanced features for managing and securing secrets, such as audit logging, secret rotation, and access control. This applies for more sensitive secrets or compliance-sensitive workloads. ## 5. Deployment Pipelines ### 5.1 Automated Deployments **Standard:** Automate the deployment process using a CI/CD pipeline. **Do This:** * Use a CI/CD tool (e.g., GitHub Actions, GitLab CI, CircleCI) to build, test, and deploy your application. * Create a "fly.toml" file that defines your application's configuration. * Use the "flyctl deploy" command to deploy your application. * Integrate automated tests into your deployment pipeline. **Don't Do This:** Manually deploy your application from your local machine. Skip testing before deploying. **Why:** Automation reduces the risk of human error and allows you to deploy changes quickly and reliably. **Example (GitHub Actions workflow):** """yaml name: Deploy to Fly.io on: push: branches: - main jobs: deploy: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: superfly/flyctl-actions/setup-flyctl@master - run: flyctl deploy --remote-only env: FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }} """ ## 6. 
Local Development with Fly.io Resources ### 6.1 Use "flyctl dev console" **Standard:** Use flyctl's dev console to streamline usage of Fly.io resources during local development. **Do This:** * Start a dev console. * Utilize the dev console in conjunction with your other debugging, logging, and testing processes. """bash flyctl dev console """ ## 7. Fly.io Extensions: Community and Official ### 7.1 Utilizing Extensions **Standard:** Explore and leverage available Fly.io extensions to enhance functionality and streamline development processes. **Do This:** * Browse the available extensions to see if any are helpful to your deployment. * Use them with the "flyctl extensions install" command. * Use only extensions you understand and trust. **Don't Do This:** Blindly install extensions without fully understanding their functionality and security implications. Neglect to explore available extensions to streamline your deployment """bash flyctl extensions install <extension_name> """ ## 8. Data Persistence with Volumes ### 8.1 Volume Management **Standard:** Create and manage persistent volumes to retain data across deployments and machine restarts. **Do This:** * Create a directory for holding all volume-persistence related code. * Use the "flyctl volumes create" command for managing volumes. * Properly configure the "fly.toml" file to mount these, following best practices like creating functions which automatically map to the volume in your app, instead of hardcoding volume names in your code. * Ensure proper backup and restore strategies for data stored on volumes. **Don't Do This:** Store critical data directly on the instance's local disk. Neglect to backup data stored on volumes. """bash flyctl volumes create data_volume --size 1 #size in gb """ """toml [mounts] source="data_volume" destination="/data" """ ## 9. 
Resource Optimization ### 9.1 Fly Machines Resource Allocation **Standard:** Optimize resource allocation for Fly Machines to minimize costs and improve performance. **Do This:** * Right-size instances based on application requirements. * Use autoscaling to dynamically adjust the number of running instances based on load. * Monitor resource utilization and adjust instance sizes as needed. **Don't Do This:** Over-provision resources. Neglect to monitor resource utilization. **Why:** Proper resource allocation can significantly reduce costs and improve application performance. **Example ("fly.toml" Autoscaling config):** """toml [processes] app = "python app.py" [deploy] auto_rollback = true [http_service] internal_port = 8080 min_machines_running = 1 processes = ["app"] [http_service.concurrency] hard_limit = 25 soft_limit = 20 type = "requests" [http_service.checks] interval = "15s" timeout = "2s" grace_period = "5s" restart_limit=3 [scale] # these are request-based, not CPU-based min_machines = 1 max_machines = 4 """