# Core Architecture Standards for MongoDB
This document outlines the core architectural standards to be followed when developing and maintaining MongoDB applications. These standards are designed to promote maintainability, performance, security, and scalability, leveraging the latest features and best practices of MongoDB. Following these guidelines will ensure consistency across projects and facilitate collaboration, especially when using AI coding assistants.
## 1. Overall Architectural Principles
### 1.1 Monorepo vs. Polyrepo
**Standard:** Favor a monorepo structure for tightly coupled microservices or components within a single product or domain. Use polyrepos for independent services or libraries with less frequent interaction.
* **Do This:** Implement a monorepo if your application consists of several microservices that frequently interact and are deployed together.
* **Don't Do This:** Use a polyrepo when services have intricate dependencies and must be managed and released together.
**Why:**
* **Monorepo:** Simplifies dependency management, code reuse, and coordinated refactoring. Facilitates atomic changes across multiple components.
* **Polyrepo:** Provides clear ownership and isolation for independent components, reducing the risk of unintended side effects during development.
**Example:**
* Monorepo Structure (Example for a social media app):
"""
/
├── services/
│   ├── user-service/
│   │   ├── src/
│   │   ├── Dockerfile
│   ├── post-service/
│   │   ├── src/
│   │   ├── Dockerfile
│   ├── notification-service/
│   │   ├── src/
│   │   ├── Dockerfile
├── libs/
│   ├── common-utils/
│   │   ├── src/
"""
* Polyrepo Structure (Three independent repositories):
* "user-service" repository
* "post-service" repository
* "notification-service" repository
### 1.2 Layered Architecture
**Standard:** Structure applications into well-defined layers (e.g., presentation, application/service, domain/business logic, data access/persistence).
* **Do This:** Separate concerns by clearly defining the responsibility of each layer. Use dependency injection to promote loose coupling.
* **Don't Do This:** Create monolithic blocks of code that mix presentation logic with database interactions.
**Why:**
Layered architecture enhances maintainability, testability, and reusability. Changes in one layer have minimal impact on other layers.
**Example:**
"""python
# data_access_layer.py
from bson import ObjectId
from pymongo import MongoClient

class UserRepository:
    def __init__(self, connection_string, database_name):
        self.client = MongoClient(connection_string)
        self.db = self.client[database_name]
        self.users = self.db.users

    def get_user_by_id(self, user_id):
        # _id values are ObjectIds by default, so convert the incoming string
        return self.users.find_one({"_id": ObjectId(user_id)})

# business_logic_layer.py
class UserService:
    def __init__(self, user_repository):
        self.user_repository = user_repository

    def get_user_profile(self, user_id):
        user = self.user_repository.get_user_by_id(user_id)
        if user:
            return {
                "user_id": str(user["_id"]),
                "username": user["username"],
                "email": user["email"]
            }
        return None

# presentation_layer.py (e.g., Flask route)
from flask import Flask, jsonify
# Assuming data_access_layer and business_logic_layer are in the same dir
from data_access_layer import UserRepository
from business_logic_layer import UserService

app = Flask(__name__)

# Configuration (replace with your actual values)
CONNECTION_STRING = "mongodb://localhost:27017/"
DATABASE_NAME = "mydatabase"

user_repository = UserRepository(CONNECTION_STRING, DATABASE_NAME)
user_service = UserService(user_repository)

@app.route("/users/<user_id>", methods=["GET"])
def get_user(user_id):
    user_profile = user_service.get_user_profile(user_id)
    if user_profile:
        return jsonify(user_profile)
    return jsonify({"message": "User not found"}), 404

if __name__ == "__main__":
    app.run(debug=True)
"""
### 1.3 Modular Design
**Standard:** Decompose the system into independent, reusable modules.
* **Do This:** Create modules with well-defined interfaces and minimal dependencies on other modules.
* **Don't Do This:** Build tightly coupled modules with extensive shared state.
**Why:**
Modularity promotes code reuse, reduces complexity, and simplifies maintenance.
**Example:**
* Modular Python structure:
"""
/
├── modules/
│   ├── authentication/
│   │   ├── __init__.py
│   │   ├── auth_service.py
│   │   ├── auth_repository.py
│   ├── user_management/
│   │   ├── __init__.py
│   │   ├── user_service.py
│   │   ├── user_repository.py
"""
### 1.4 Event-Driven Architecture
**Standard:** Employ event-driven architecture for decoupled communication between services, using message queues or similar mechanisms.
* **Do This:** Use message queues (e.g., RabbitMQ or Kafka) or MongoDB change streams to communicate asynchronously between services.
* **Don't Do This:** Rely on direct synchronous calls between services that create tight coupling.
**Why:**
Event-driven architecture enables scalability, fault tolerance, and flexible integration.
**Example:**
"""python
# Producer (User Service)
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]
users = db["users"]
def create_user(user_data):
    result = users.insert_one(user_data)
    user_id = str(result.inserted_id)
    # Simulate sending an event
    print(f"UserCreated Event: User ID - {user_id}")  # In a real use case, publish to a message queue
    return user_id

# Consumer (Notification Service - simulates consuming a change event)
def handle_user_created_event(user_id):
    print(f"Notification Service: Sending welcome email to user with ID: {user_id}")
# Simulate a user creation
user_id = create_user({"username": "testuser", "email": "test@example.com"})
# Simulate the consumption of the event. This would actually be triggered async by the queue/change stream
handle_user_created_event(user_id)
# Example using MongoDB Change Streams as a consumer
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]
users = db["users"]
# Start a change stream
with users.watch() as stream:
    for change in stream:
        if change['operationType'] == 'insert':
            user_id = str(change['documentKey']['_id'])
            print(f"Received insert operation for user ID: {user_id}")
            # Send a welcome email to the user.
            print(f"Simulating email to user: {user_id}")

# The "with users.watch() as stream:" block keeps the connection open and continues to
# listen for changes indefinitely unless it is interrupted or hits an unrecoverable error.
# Simulate user creation in another terminal or through a different process to trigger change stream events.
"""
### 1.5 Microservices Architecture Considerations
**Standard:** Favor Microservices (with Bounded Contexts) for isolating failure domains, scaling specific features independently, and enabling autonomous team development.
* **Do This:** Design microservices with clear boundaries based on business domains. Employ lightweight communication protocols like REST or gRPC.
* **Don't Do This:** Create large, monolithic services that combine multiple unrelated functions, defeating the purpose of the microservices approach.
**Why:**
Microservices enable independent deployment, scaling, and technology choices for each service.
**Example:**
* Microservice Architecture (Ordering Service Example):
"""
/
├── order-service/
│   ├── src/
│   ├── Dockerfile
│   ├── api.py              # REST endpoints for order management
│   ├── models.py           # Order-related data models
│   ├── order_processor.py  # Business logic for order processing
│   ├── requirements.txt
"""
Each microservice would have its own corresponding repository and deployment pipeline. The "order-service" might interact with "payment-service" and "shipping-service" via REST APIs or message queues.
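As a hedged sketch of such an interaction (the service URL, endpoint, and payload below are assumptions for illustration), the order service might call the payment service over REST with a bounded timeout:
"""python
# order-service/order_processor.py (illustrative sketch)
import requests

PAYMENT_SERVICE_URL = "http://payment-service:8080"  # assumed internal address of the payment service

def charge_order(order_id, amount):
    """Charge an order via the payment service's REST API (endpoint is hypothetical)."""
    response = requests.post(
        f"{PAYMENT_SERVICE_URL}/payments",
        json={"order_id": order_id, "amount": amount},
        timeout=5,  # avoid blocking order processing indefinitely on a peer service
    )
    response.raise_for_status()
    return response.json()
"""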
## 2. MongoDB Specific Architecture
### 2.1 Schema Design
**Standard:** Schema design should prioritize query patterns. Embed related data when read together frequently. Use references for less frequently accessed data.
* **Do This:** Embed addresses within customer documents for frequently accessed information. Reference product details in order documents.
* **Don't Do This:** Normalize data to the extreme, causing unnecessary joins. Embed excessive array data leading to document growth issues.
**Why:**
Optimized schema design is crucial for MongoDB performance, minimizing disk I/O and network traffic.
**Example:**
"""json
// Embedded:
{
"_id": ObjectId("..."),
"customer_name": "John Doe",
"address": {
"street": "123 Main St",
"city": "Anytown",
"zip": "12345"
},
"orders": [ //Keep "orders" array reasonably small.
{
"order_id": ObjectId("..."),
"product_id": ObjectId("..."),
"quantity": 2
}
]
}
// Referenced:
{
"_id": ObjectId("..."),
"customer_name": "John Doe",
"address_id": ObjectId("...") // Reference to address document
}
// Separate Address document:
{
"_id": ObjectId("..."),
"street": "123 Main St",
"city": "Anytown",
"zip": "12345"
}
"""
### 2.2 Data Modeling Patterns
**Standard:** Utilize data modeling patterns such as the Polymorphic pattern, Attribute pattern, and Bucket pattern to optimize data storage and retrieval.
* **Do This:** Use the Polymorphic pattern to store different types of products within the same collection. Employ the Bucket pattern to group time-series data into manageable chunks.
* **Don't Do This:** Avoid using modeling patterns inappropriately, such as applying the Bucket pattern to non-time-series data.
**Why:**
Data modeling patterns improve query efficiency and accommodate evolving data structures.
**Examples:**
* **Polymorphic Pattern:**
"""json
// Products Collection
[
{
"_id": ObjectId("654321abbced123..."),
"productType": "Book",
"title": "The MongoDB Handbook",
"author": "John Doe",
"isbn": "123-4567890"
},
{
"_id": ObjectId("987654zyxwvu321..."),
"productType": "DVD",
"title": "MongoDB for Beginners",
"director": "Jane Smith",
"runtime": 120
}
]
"""
* **Bucket Pattern (Time Series data for sensor readings)**
"""json
// Reads Collection with the "bucket" field
[
{
"_id": ObjectId(),
"bucket": "2023-11-01",
"sensorId": "sensor123",
"readings": [
{ "timestamp": ISODate("2023-11-01T10:00:00Z"), "value": 22.5 },
{ "timestamp": ISODate("2023-11-01T10:01:00Z"), "value": 22.6 }
]
},
{
"_id": ObjectId(),
"bucket": "2023-11-02",
"sensorId": "sensor123",
"readings": [
{ "timestamp": ISODate("2023-11-02T10:00:00Z"), "value": 22.7 },
{ "timestamp": ISODate("2023-11-02T10:01:00Z"), "value": 22.8 }
]
}
]
"""
### 2.3 Indexing Strategy
**Standard:** Create indexes to support common query patterns and optimize performance. Follow the ESR (Equality, Sort, Range) rule when defining compound indexes.
* **Do This:** Create indexes on fields used in "find()", "sort()", and range queries. Consider using compound indexes when filtering on multiple fields.
* **Don't Do This:** Over-index collections, which degrades write performance, or create indexes on fields that are rarely queried.
**Why:**
Proper indexing significantly reduces query latency and resource consumption.
**Examples:**
"""javascript
// Single field index
db.collection.createIndex( { "field1": 1 } )
// Compound index (ESR rule)
db.collection.createIndex( { "equalityField": 1, "sortField": 1, "rangeField": 1 } )
// Text index
db.collection.createIndex( { "field1": "text" } )
"""
### 2.4 Aggregation Pipeline
**Standard:** Leverage the aggregation pipeline for complex data transformations and reporting tasks. Optimize pipelines by using indexes efficiently.
* **Do This:** Use "$match" early in the pipeline to reduce the amount of data processed. Utilize "$project" to reshape documents and remove unnecessary fields.
* **Don't Do This:** Run complex aggregations without considering performance. Avoid using "$lookup" excessively, which can be slow for large datasets (consider denormalization instead).
**Why:**
The aggregation pipeline provides powerful data processing capabilities directly within MongoDB.
**Example:**
"""javascript
db.orders.aggregate([
{
$match: { //Stage 1: Filter using an index
"status": "active",
"order_date": { $gte: ISODate("2023-01-01T00:00:00Z") }
}
},
{
$lookup: { // Stage 2: Join with the products collection ($lookup matches on products._id, which is indexed by default)
from: "products",
localField: "product_id",
foreignField: "_id",
as: "product"
}
},
{
$unwind: "$product" //Stage 3: Deconstruct the product array
},
{
$group: { // Stage 4: Group by customer and sum the order values
_id: "$customer_id",
total_spent: { $sum: { $multiply: [ "$product.price", "$quantity" ] } }
}
},
{
$sort: { total_spent: -1 } //Stage 5: Sort by total spent
}
])
"""
### 2.5 Change Streams
**Standard:** Effectively utilize change streams to react to real-time data changes and build reactive applications. Configure streams according to your application's needs.
* **Do This:** Use change streams for auditing, real-time analytics, and triggering notifications based on data modifications. Filter events to reduce overhead.
* **Don't Do This:** Neglect error handling within the change stream listener. Overload the change stream with unnecessary event processing.
**Example:**
"""python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]
collection = db["mycollection"]

with collection.watch() as stream:
    for change in stream:
        print(f"Change detected: {change}")
        # Process the change event (e.g., update a cache, send a notification)

# To watch only specific events
resume_token = None  # start from the current point
try:
    with collection.watch(resume_after=resume_token, full_document='updateLookup') as stream:
        for change in stream:
            resume_token = stream.resume_token
            if change['operationType'] == 'update':
                print(f"Partial Update detected: {change['updateDescription']}")
            elif change['operationType'] == 'insert':
                print(f"Insert detected: {change['fullDocument']}")
except Exception as e:
    print(f"Change stream error: {e}")
"""
### 2.6 Transactions
**Standard:** Use transactions when atomicity is required across multiple operations or documents. Design schemas to minimize the need for complex transactions.
* **Do This:** Use multi-document transactions to ensure data consistency in critical operations, such as transferring funds between accounts.
* **Don't Do This:** Overuse transactions, which can impact performance. Avoid long-running transactions that hold locks for extended periods.
**Example:**
"""python
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]
accounts = db["accounts"]

def transfer_funds(from_account_id, to_account_id, amount):
    with client.start_session() as session:
        def callback(session):
            from_account = accounts.find_one({"_id": from_account_id}, session=session)
            to_account = accounts.find_one({"_id": to_account_id}, session=session)
            if not from_account or not to_account or from_account["balance"] < amount:
                raise ValueError("Insufficient funds or invalid accounts")
            accounts.update_one({"_id": from_account_id}, {"$inc": {"balance": -amount}}, session=session)
            accounts.update_one({"_id": to_account_id}, {"$inc": {"balance": amount}}, session=session)
            return True  # Indicate success

        try:
            session.with_transaction(callback,
                                     read_concern=ReadConcern('snapshot'),
                                     write_concern=WriteConcern('majority'))
            print("Transaction completed successfully.")
        except Exception as e:
            print(f"Transaction failed: {e}")

# Example Usage
transfer_funds("account1", "account2", 100)
"""
## 3. Technology Stack & Tooling
### 3.1 ODM/ORM Libraries
**Standard:** Utilize ODM/ORM libraries such as Mongoose (Node.js), MongoEngine (Python), or Morphia (Java), but understand their performance implications.
* **Do This:** Use these libraries to simplify data validation, schema management, and object mapping.
* **Don't Do This:** Neglect performance optimization. Be mindful of how the ORM translates queries into MongoDB operations.
**Why:** These simplify interactions with MongoDB and promote structured coding.
**Example (Mongoose - Javascript/Node.js)**:
"""javascript
const mongoose = require('mongoose');
// Define a schema
const userSchema = new mongoose.Schema({
username: { type: String, required: true, unique: true },
email: { type: String, required: true },
age: { type: Number, min: 18, max: 120 }
});
// Create a model from the schema
const User = mongoose.model('User', userSchema);
// Example usage
const newUser = new User({
username: 'johndoe',
email: 'john.doe@example.com',
age: 30
});
newUser.save()
.then(() => console.log('User created'))
.catch(err => console.error(err));
"""
### 3.2 Connection Pooling
**Standard:** Implement connection pooling for efficient database access.
* **Do This:** Configure connection pooling in your MongoDB driver to reuse connections and reduce overhead. Control the maximum and minimum pool sizes.
* **Don't Do This:** Open and close connections frequently, which can drain resources and slow down performance.
**Why:**
Connection pooling minimizes connection setup overhead and keeps database access fast under concurrent load.
**Example (Python with PyMongo):**
"""python
from pymongo import MongoClient
# Configure connection pooling
client = MongoClient("mongodb://localhost:27017/",
maxPoolSize=50,
minPoolSize=10)
db = client["mydatabase"]
collection = db["mycollection"]
# The client automatically manages the connection pool
"""
### 3.3 Monitoring and Logging
**Standard:** Implement robust monitoring, logging and tracing solutions.
* **Do This:** Use MongoDB Atlas or tools such as Grafana, Prometheus, or the ELK stack to monitor key metrics. Include detailed logging in your services to track errors and performance.
* **Don't Do This:** Ignore database performance metrics. Fail to log errors or slow queries.
**Why:**
Monitoring allows you to pinpoint potential performance issues or security problems.
**Example (Logging slow queries - Javascript/Node.js):**
"""javascript
// Enable profiler to log slow queries (for development/debugging - use with caution on production)
db.setProfilingLevel(1, { slowms: 100 }); // Log queries slower than 100ms
// Retrieve slow queries
db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }).limit(10)
// Proper logging using a library like Winston/Bunyan is recommended
const winston = require('winston');
const logger = winston.createLogger({ transports: [new winston.transports.Console()] });
logger.info('Query executed', { query: 'db.collection.find({})', duration: 120 });
"""
These standards provide a robust foundation for building reliable, scalable, and secure MongoDB applications, improving code clarity, and facilitating easier maintenance by development teams. They should be enforced via code reviews, automated linters, and regular training. AI tools should be configured to adhere to these standards.
# State Management Standards for MongoDB This document outlines the standards and best practices for managing application state with MongoDB. State management encompasses how data is stored, accessed, modified, and synchronized across different parts of an application. Effective state management ensures data consistency, improves application performance, and simplifies development workflows. This rule focuses specifically on principles applicable to MongoDB and differentiates between good code and exceptional solutions within the MongoDB ecosystem. ## 1. Overview of State Management in MongoDB Applications Managing state in MongoDB applications requires understanding how to leverage MongoDB's features to maintain data integrity and optimize application performance. Consider the following concepts: * **Data Modeling:** Designing schemas that accurately reflect the relationships and structure of your application's data. * **Atomicity:** Ensuring that operations are performed as a single, indivisible unit to prevent partial updates. * **Consistency:** Maintaining data integrity by enforcing constraints and validation rules. * **Isolation:** Preventing concurrent operations from interfering with each other. * **Durability:** Guaranteeing that once an operation is committed, it remains persistent even in the event of system failures. MongoDB provides the mechanisms for managing state directly, which includes transactions, schema validation, and change streams, but application-level state management requires judicious decisions about when and how to trigger these mechanisms. ## 2. Data Modeling and Schema Design Proper data modeling is fundamental to effective state management. A well-designed schema ensures data consistency, simplifies queries, and optimizes performance. ### 2.1. Standards for Schema Design * **Do This:** Use embedded documents for one-to-one and one-to-few relationships to reduce the need for joins and improve query performance. * **Why:** Embedding reduces the number of database operations required to retrieve related data. """javascript // Example: Embedding address information within a user document { _id: ObjectId(), username: "johndoe", email: "john.doe@example.com", address: { street: "123 Main St", city: "Anytown", zip: "12345" } } """ * **Do This:** Use references (dbrefs or manual references) for one-to-many and many-to-many relationships to avoid document growth and improve scalability. * **Why:** References allow you to link related documents without duplicating data. """javascript // Example: Using manual references to link a user to their orders // User document { _id: ObjectId("user123"), username: "johndoe", email: "john.doe@example.com" } // Order document { _id: ObjectId(), userId: ObjectId("user123"), // Reference to the user document orderDate: ISODate("2024-01-01T00:00:00Z"), items: [...] } """ *
# Performance Optimization Standards for MongoDB This document outlines coding standards and best practices for optimizing performance in MongoDB applications. These guidelines are designed to improve application speed, responsiveness, and resource utilization. They align with the latest MongoDB features and capabilities and aim to create maintainable, efficient, and scalable solutions. ## 1. Schema Design and Data Modeling ### 1.1. Choosing the Right Data Model * **Do This:** Carefully evaluate the one-to-many and many-to-many relationships in your data and choose the data model that best reflects the application's read and write patterns. Consider embedding, referencing, or a hybrid approach. * **Don't Do This:** Blindly normalize all data, which can lead to excessive joins (lookups) and poor performance in MongoDB. * **Why:** MongoDB excels when related data can be accessed in a single document. Minimizing the number of queries reduces latency. * **Example:** For a blog application, embed comments within the post document. """javascript // Embedded comments in a post document { _id: ObjectId("..."), title: "Optimizing MongoDB Performance", content: "...", comments: [ { author: "John Doe", text: "Great post!", date: ISODate("...") }, { author: "Jane Smith", text: "Very informative.", date: ISODate("...") } ] } """ ### 1.2. Data Size and Document Structure * **Do This:** Keep document sizes within reasonable limits (ideally, under 16MB). Avoid excessively large arrays or deeply nested structures. * **Don't Do This:** Store large binary files or multimedia content directly within the document. Use GridFS for these scenarios instead. * **Why:** Large documents can impact indexing performance and network transfer times. Extremely deep nesting can slow down query processing. * **GridFS Example:** Storing a large image file. """javascript // Uploading a file using GridFS const { GridFSBucket } = require('mongodb'); const fs = require('fs'); async function uploadFile(db, filePath, filename) { const bucket = new GridFSBucket(db, { bucketName: 'images' }); const uploadStream = bucket.openUploadStream(filename); fs.createReadStream(filePath).pipe(uploadStream); uploadStream.on('finish', () => { console.log('File uploaded successfully!'); }); } """ ### 1.3 Atomicity * **Do This:** Use transactions for operations that require atomicity across multiple documents, ensuring all changes are applied or none at all. * **Don't Do This:** Rely on application-level logic for atomicity, as this can lead to data inconsistencies. * **Why:** Transactions guarantee ACID properties, which are crucial for data integrity in complex operations. * **Example (MongoDB 4.0+):** """javascript const session = client.startSession(); try { session.startTransaction(); const coll1 = client.db("mydb").collection("inventory"); const coll2 = client.db("mydb").collection("customers"); await coll1.updateOne({ _id: 1 }, { $inc: { qty: -1 } }, { session }); await coll2.updateOne({ _id: 123 }, { $inc: { points: 10 } }, { session }); await session.commitTransaction(); console.log("Transaction committed successfully."); } catch (error) { await session.abortTransaction(); console.error("Transaction aborted due to error:", error); } finally { session.endSession(); } """ ## 2. Indexing Strategies ### 2.1. Index Selection and Creation * **Do This:** Create indexes on fields frequently used in queries, sort operations, and aggregations. Use the "explain()" method to analyze query performance and identify missing indexes. 
* **Don't Do This:** Create indexes indiscriminately, as each index adds overhead to write operations. Regularly review and remove unused indexes. * **Why:** Indexes significantly speed up query execution by allowing MongoDB to locate documents more quickly. * **Example:** Creating an index on the "userId" field for faster user lookups. """javascript db.collection('users').createIndex({ userId: 1 }); """ ### 2.2. Index Types * **Do This:** Use appropriate index types for your data and query patterns: * **Single Field Index:** Indexing a single field. * **Compound Index:** Indexing multiple fields (order matters!). * **Multikey Index:** Indexing array fields. * **Text Index:** For full-text search. * **Geospatial Index:** For geospatial queries. * **Don't Do This:** Rely solely on the default "_id" index for all queries. * **Why:** Different index types are optimized for specific query types. Choosing the correct index type maximizes performance. * **Example:** Creating a compound index for sorting and filtering. """javascript db.collection('products').createIndex({ category: 1, price: -1 }); // Sort by price descending within each category """ ### 2.3. Indexing Arrays * **Do This:** Use multikey indexes to efficiently query array fields. * **Don't Do This:** Underestimate the performance implications of querying arrays without proper indexing. * **Why:** Multikey indexes allow MongoDB to efficiently locate documents where the specified array field contains a specific value. * **Example:** Indexing the "tags" array in a blog post document. """javascript db.collection('posts').createIndex({ tags: 1 }); """ ### 2.4. Partial Indexes * **Do This:** Use partial indexes to index only a subset of documents based on a filter expression, reducing index size and improving write performance. * **Don't Do This:** Create indexes on all documents, even if a significant portion of them are rarely queried. * **Why:** Partial indexes optimize index size and write performance by excluding irrelevant documents. * **Example:** Creating a partial index on active users. """javascript db.collection('users').createIndex( { lastLogin: 1 }, { partialFilterExpression: { status: 'active' } } ); """ ### 2.5. Covered Queries * **Do This:** Strive for covered queries where MongoDB can retrieve all necessary data directly from the index without accessing the document itself. * **Don't Do This:** Assume that an index automatically covers a query; verify using "explain()". * **Why:** Covered queries are significantly faster because they eliminate the need for disk I/O. * **Example:** Considering a "products" collection with "category", "price", and "name" fields: """javascript db.collection('products').createIndex({ category: 1, price: 1, name: 1 }); // Covered query: only retrieves fields present in the index db.collection('products').find({ category: "electronics", price: { $lt: 100 } }, { projection: { category: 1, price: 1, name: 1, _id: 0 } }).explain("executionStats"); """ In the "explain" output, check for "coveredQuery" and "indexOnly" being true. ## 3. Query Optimization ### 3.1. Query Selectivity * **Do This:** Write queries that are highly selective, targeting a small subset of documents. * **Don't Do This:** Perform full collection scans with broad queries that return a large number of documents. * **Why:** Selective queries minimize the amount of data MongoDB needs to process, improving performance. * **Example:** Using specific criteria in a "find()" operation. 
"""javascript db.collection('orders').find({ userId: "123", status: "pending" }); """ ### 3.2. Projection * **Do This:** Use projection to return only the fields required by the application, reducing network traffic and memory usage. * **Don't Do This:** Retrieve the entire document ("{}") if only a few fields are needed. * **Why:** Projection reduces the amount of data transferred over the network and processed by the client. * **Example:** Retrieving only the "name" and "email" fields from a "users" collection. """javascript db.collection('users').find({ status: "active" }, { projection: { name: 1, email: 1, _id: 0 } }); """ ### 3.3. Limit and Skip * **Do This:** Use "limit()" to restrict the number of documents returned and "skip()" for pagination. Be mindful of the "skip()" performance implications with large offsets. Use more performant pagination methods such as range-based queries when possible. * **Don't Do This:** Use "skip()" with large offsets, as it can become inefficient, especially on large collections. * **Why:** "limit()" reduces the amount of data transferred, while "skip()" allows for pagination but becomes slow with large offsets as it still has to traverse the skipped records. * **Example:** Implementing pagination with "limit()" and "skip()". """javascript const page = 2; const pageSize = 10; db.collection('products') .find({}) .skip((page - 1) * pageSize) .limit(pageSize) .toArray(); """ * **Alternative Pagination with Range Queries (more efficient):** If you have a field that can be used for ordering (e.g., "_id", "createdAt"), you can use range queries for more efficient pagination, especially for large datasets: """javascript // First page db.collection('products').find({}).sort({ createdAt: 1 }).limit(pageSize).toArray(); // Subsequent pages - assuming you have stored the createdAt value of the last item of the previous page const lastCreatedAt = new Date('2024-01-01T12:00:00Z'); // Replace with the actual value. db.collection('products').find({ createdAt: { $gt: lastCreatedAt } }).sort({ createdAt: 1 }).limit(pageSize).toArray(); """ ### 3.4. Aggregation Pipeline Optimization * **Do This:** Structure aggregation pipelines to filter data as early as possible using "$match" to reduce the amount of data processed in subsequent stages. Use "$project" to reshape or reduce the size of documents as needed throughout the pipeline. Utilize indexes for stages that support them, particularly "$match" and "$sort". * **Don't Do This:** Perform expensive operations like "$unwind" or "$group" on large unfiltered datasets. Accumulate large amounts of data in memory within pipeline stages without reducing it effectively. * **Why:** Optimizing the order and operations in an aggregation pipeline can significantly reduce resource consumption and improve performance, especially for complex data transformations. Filtering early reduces the amount of data the pipeline needs to shuffle and process. 
* **Example:** Aggregating order data to calculate total sales per product category, optimized with early filtering: """javascript db.collection('orders').aggregate([ { $match: { // Filter early to reduce data processed in later stages orderDate: { $gte: new Date('2023-01-01'), $lt: new Date('2024-01-01') } } }, { $unwind: "$items" // Deconstruct the items array to process each item }, { $lookup: { // Enrich each item with product details from the products collection from: "products", localField: "items.productId", foreignField: "_id", as: "productDetails" } }, { $unwind: "$productDetails" // Deconstruct the productDetails array for access }, { $group: { // Group by product category to calculate total sales _id: "$productDetails.category", totalSales: { $sum: { $multiply: ["$items.quantity", "$productDetails.price"] } } } }, { $project: { // Reshape the output to show category and total sales category: "$_id", totalSales: 1, _id: 0 } }, { $sort: { totalSales: -1 } // Sort by total sales in descending order } ]).toArray(); """ * Adding an index "{ orderDate: 1 } " will improve performance when using the "$match" stage. ## 4. Data Access Patterns and Caching ### 4.1. Connection Pooling * **Do This:** Implement connection pooling to reuse database connections, reducing the overhead of establishing new connections for each operation. Configure an adequate pool size based on the application's concurrency. * **Don't Do This:** Create a new database connection for every operation, as this will significantly increase latency and resource consumption. * **Why**: Establishing database connections is a resource-intensive process. Connection pooling allows applications to efficiently reuse existing connections. * **Example (Node.js):** """javascript const { MongoClient } = require('mongodb'); const uri = "mongodb://user:password@host:port/database"; // Replace with your connection string const client = new MongoClient(uri, { maxPoolSize: 100, // Adjust based on needs minPoolSize: 10, // Other pool options per driver }); async function run() { try { await client.connect(); const db = client.db("mydb"); // ... perform operations using 'db' } finally { // Ensures that the client will close when you finish/error // await client.close(); // Keep the connection open across multiple function calls across app lifetime. } } run().catch(console.dir); """ ### 4.2. Caching Strategies * **Do This:** Implement caching at various levels (application, database, or dedicated caching layer like Redis) to store frequently accessed data. Use appropriate cache invalidation strategies to ensure data consistency. Implement TTL (Time-To-Live) based caching for data that becomes stale after a certain period. * **Don't Do This:** Cache data indefinitely without considering data changes or consistency requirements. Rely solely on the database for all data access without leveraging caching. * **Why:** Caching reduces the load on the database and improves application response times by serving data from memory. 
* **Example (basic in-memory caching in Node.js):** """javascript const cache = new Map(); async function getUser(userId) { if (cache.has(userId)) { console.log("Serving from cache"); return cache.get(userId); } const user = await db.collection('users').findOne({ _id: userId }); if (user) { cache.set(userId, user); console.log("Fetched from DB and cached"); } return user; } """ * **Example (using a TTL):** """javascript const ttlCache = require( 'ttl-cache' ) const myCache = new ttlCache({ ttl: 60 * 1000 }) //60 seconds async function getUser(userId) { if (myCache.has(userId)) { console.log("Serving from TTL cache"); return myCache.get(userId); } const user = await db.collection('users').findOne({ _id: userId }); if (user) { myCache.set(userId, user); console.log("Fetched from DB and cached"); } return user; } """ ### 4.3. Read Preference * **Do This:** Configure read preference settings (e.g., "primaryPreferred", "secondaryPreferred") based on the application's read consistency requirements and deployment architecture. * **Don't Do This:** Always read from the primary, especially in read-heavy applications, which can overload the primary node. * **Why:** Read preference allows you to distribute read operations across replica set members, improving read scalability and reducing load on the primary. * **Example (Node.js driver):** """javascript const { MongoClient, ReadPreference } = require('mongodb'); const uri = "mongodb://user:password@host1:port,host2:port/?replicaSet=myReplicaSet"; async function readFromSecondary(db) { const collection = db.collection('myCollection').withReadPreference(ReadPreference.SECONDARY_PREFERRED); const doc = await collection.findOne({}); return doc; } """ ## 5. Monitoring and Profiling * **Do This:** Regularly monitor MongoDB performance metrics using tools like MongoDB Atlas Performance Advisor, "mongostat", "mongotop", or the MongoDB Profiler. Enable the database profiler to identify slow-running queries and operations. * **Don't Do This:** Neglect monitoring and profiling, as this can lead to unnoticed performance bottlenecks. * **Why:** Monitoring and profiling provide valuable insights into database performance, allowing you to identify and address performance issues proactively. * **Example (enabling the MongoDB Profiler):** """javascript db.setProfilingLevel(2); // Log all operations slower than the slowms threshold (default 100ms). Level 0 is off, level 1 logs slow operations """ * **Example (using Atlas Performance Advisor)-** Atlas provides query suggestions, based on your workload, to improve performance. ## 6. Hardware and Configuration ### 6.1. Storage Engine * **Do This:** Use the WiredTiger storage engine, which is the default and generally recommended storage engine for most workloads due to its compression and concurrency features. * **Don't Do This:** Continue to use the older MMAPv1 storage engine unless there is a specific reason to do so, as it lacks the performance optimizations of WiredTiger. * **Why**: WiredTiger provides significant improvements in performance and storage efficiency over MMAPv1. * The WiredTiger storage engine supports document-level concurrency control, compression, and encryption at rest. ### 6.2. Memory and Disk Configuration * **Do This:** Provide sufficient RAM to accommodate the working set (frequently accessed data) and indexes. Use fast storage (SSD) for optimal performance. * **Don't Do This:** Underestimate memory and disk requirements, as this can lead to disk I/O bottlenecks and poor performance. 
* **Why:** Adequate memory and fast storage are crucial for minimizing disk I/O and maximizing performance. ### 6.3. Sharding * **Do This:** Consider sharding for very large datasets or high-write workloads to distribute data and load across multiple servers. Choose the shard key carefully based on query patterns and data distribution. * **Don't Do This:** Implement sharding prematurely without assessing the need for it, as it adds complexity to the architecture. * **Why:** Sharding allows you to scale horizontally by distributing data across multiple servers. ## 7. Security Considerations * **Do This:** Enable authentication, authorization, and encryption to protect sensitive data. Follow MongoDB's security best practices to minimize the risk of security vulnerabilities. Rotate database credentials regularly. * **Don't Do This:** Expose MongoDB instances directly to the internet without proper security measures. Store sensitive data in plain text. * **Why:** Security is paramount, and failure to secure MongoDB can result in data breaches and other serious consequences. ## 8. Language and Version * **Do This:** Use the latest stable version of MongoDB and the official drivers for your programming language of choice. Stay informed about new features and performance improvements in each release. * **Don't Do This:** Stay on outdated versions of MongoDB or drivers, as you will miss out on performance optimizations and security fixes. * **Why:** Newer versions of MongoDB and drivers often include performance optimizations and new features that can significantly improve application performance. Staying current also ensures access to the latest security patches.
# Security Best Practices Standards for MongoDB This document outlines the security best practices for MongoDB development. Following these standards will help protect against common vulnerabilities, promote secure coding patterns, and ensure the overall security of your MongoDB applications. ## 1. Authentication and Authorization ### 1.1. Enable Authentication and Authorization **Standard:** Always enable authentication and authorization in your MongoDB deployments. Relying on default settings without authentication is a significant security risk. * **Do This:** Enable authentication and authorization using the "--auth" option in "mongod" or "mongos" configurations or within the configuration file. * **Don't Do This:** Never run MongoDB instances without authentication enabled, especially in production environments. **Why:** Unauthenticated access allows anyone to read or modify data. Authentication ensures that only authorized users can access the MongoDB instance. **Code Example (Configuration File):** """yaml security: authorization: enabled """ **Anti-Pattern:** Forgetting to enable authentication after initial setup. ### 1.2. Use Strong Authentication Mechanisms **Standard:** Employ strong authentication mechanisms and avoid weak or deprecated methods. * **Do This:** Use SCRAM-SHA-256 as the default authentication mechanism and use x.509 certificate based authentication for enhanced security. For user management via "mongosh", ensure you're connecting with a secure and encrypted connection. Consider using MongoDB Atlas for easier credential management. * **Don't Do This:** Avoid using the deprecated MONGODB-CR authentication mechanism. Never store passwords in plain text. **Why:** SCRAM-SHA-256 provides better protection against password cracking compared to older mechanisms. x.509 certificates establish trust at the network level. **Code Example (Creating a User with SCRAM-SHA-256):** """javascript // Using mongosh db.createUser( { user: "myUser", pwd: passwordPrompt(), // Or a securely generated password roles: [ { role: "readWrite", db: "mydb" } ], mechanisms: [ "SCRAM-SHA-256" ] } ) """ **Anti-Pattern:** Using default or easily guessable passwords. ### 1.3. Role-Based Access Control (RBAC) **Standard:** Implement RBAC to control access to data and operations within the database. * **Do This:** Define granular roles with specific privileges and assign users to these roles based on their responsibilities. Use built-in roles when appropriate or create custom roles for specialized needs. * **Don't Do This:** Avoid granting overly permissive roles (e.g., "dbOwner") to users who only require limited access. **Why:** RBAC limits the potential damage from compromised accounts and enforces the principle of least privilege. **Code Example (Creating a Custom Role):** """javascript db.createRole( { role: "reportReader", privileges: [ { resource: { db: "reports
# Component Design Standards for MongoDB This document outlines the component design standards for MongoDB development. The goal is to promote the creation of reusable, maintainable, and performant components within MongoDB applications. These standards apply specifically to interactions with MongoDB, including schema design, query construction, data access, and aggregation pipelines. The best practices and modern approaches discussed here are based on the latest versions of MongoDB. ## I. General Principles of Component Design Before diving into MongoDB-specific considerations, it's essential to establish general principles for component design. These principles promote modularity, reusability, and maintainability, which are crucial for building robust applications. ### A. Single Responsibility Principle (SRP) * **Do This:** Ensure that each component has one, and only one, reason to change. For database interactions, this might mean a component is solely responsible for accessing or manipulating a specific collection or a defined subset of fields within a document. * **Don't Do This:** Avoid creating "god" components that handle multiple unrelated tasks. This leads to tight coupling and makes the component difficult to understand, test, and modify. Avoid unnecessary abstraction upfront; adhere to YAGNI ("You Ain't Gonna Need It") and DRY ("Don't Repeat Yourself"). * **Why:** SRP reduces complexity and improves maintainability. Changes in one area are less likely to affect other parts of the system. When creating components focused on database operations, SRP helps isolate issues related to data access and manipulation. ### B. Open/Closed Principle (OCP) * **Do This:** Design components that are open for extension but closed for modification. Achieved through interfaces, abstract classes, or configuration, not through directly modifying source code. * **Don't Do This:** Directly modify the core logic of a component to add new functionality. This can introduce bugs and makes it harder to track changes and revert to previous versions. * **Why:** OCP allows you to add new features without risking the stability of existing code. In a MongoDB context, this could mean using a configuration-driven approach to define query parameters or schema validation rules without altering the core data access logic. ### C. Liskov Substitution Principle (LSP) * **Do This:** Ensure that subtypes (derived classes or implementations) of a component can be used interchangeably with their base type without altering the correctness of the program. * **Don't Do This:** Create subtypes that violate the expectations of the base type. This can lead to unexpected behavior and runtime errors. * **Why:** LSP ensures that polymorphism works as expected and that substituting one component for another does not break the system. In data access patterns, if you define an interface for data retrieval, all implementations of that interface should behave predictably. ### D. Interface Segregation Principle (ISP) * **Do This:** Design interfaces that are specific to the needs of the client. Avoid forcing clients to depend on methods they don't use. * **Don't Do This:** Create large "fat" interfaces that expose a wide range of functionality to all clients. * **Why:** ISP reduces coupling and improves flexibility. Components only depend on the methods they need, making it easier to change or replace individual components without affecting others. 
In MongoDB, each interface should define the specific operations needed for database interactions for each component. ### E. Dependency Inversion Principle (DIP) * **Do This:** High-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details. Details should depend on abstractions. * **Don't Do This:** Allow high-level modules to directly depend on low-level modules. This creates tight coupling and makes it difficult to test or replace the low-level modules. * **Why:** DIP promotes loose coupling and improves testability. By depending on abstractions, components become more flexible and easier to adapt to changing requirements. In MongoDB scenarios, this could entail using repositories or data access objects (DAOs), mediating between the rest of the application and the MongoDB driver. ## II. MongoDB-Specific Component Design Here, we apply general component design principles to the specifics of MongoDB development. ### A. Schema Design * **Do This:** Design schemas that align with your application's data access patterns, querying needs, and consistency requirements. Use embedded documents ("$elemMatch"), arrays, and denormalization strategically to optimize read performance and reduce the need for joins. Use schema validation to enforce document structure and data types. Consider shard keys early in the design process if sharding is anticipated. * **Don't Do This:** Create overly normalized schemas that require numerous joins or inefficient queries. Design schemas that mirror relational database designs. Over-rely on schema validation to enforce application-level business rules. * **Why:** Effective schema design directly impacts query performance, storage efficiency, and overall application scalability. Schema validation ensures data integrity and reduces errors. A well-designed schema enables efficient data access and manipulation, reduces the need for complex aggregation pipelines, and simplifies code. """javascript // Example: Schema validation db.createCollection( "contacts", { validator: { $jsonSchema: { bsonType: "object", required: [ "phone", "name", "age", "status" ], properties: { phone: { bsonType: "string", description: "must be a string and match the pattern" }, name: { bsonType: "string", description: "must be a string and is required" }, age: { bsonType: "int", minimum: 0, maximum: 120, description: "must be an integer in [ 0, 120 ] and is required" }, status: { enum: [ "Unknown", "Incomplete", "Complete" ], description: "can only be one of the enum values and is required" } } } }, validationLevel: "moderate", validationAction: "warn" } ) """ ### B. Query Construction * **Do This:** Use the MongoDB query API effectively to retrieve data efficiently. Utilize indexes to speed up queries. Construct queries programmatically to avoid string concatenation and potential injection vulnerabilities. Leverage projection to retrieve only the necessary fields. Use aggregation pipelines for complex data transformations and analytics. Use "explain()" to view the query plan and identify performance bottlenecks. * **Don't Do This:** Construct queries using string concatenation, which can lead to NoSQL injection vulnerabilities. Over-index collections, as each index adds overhead to write operations. Retrieve all fields from documents when only a subset is needed. Neglect using aggregation pipelines for reporting and analytics. * **Why:** Efficient query construction is crucial for application performance. 
Indexes can dramatically speed up queries, while projections reduce network traffic and memory usage. Aggregation pipelines enable powerful data analysis capabilities directly within the database. Avoiding manual string construction for queries prevents security vulnerabilities. """javascript // Example: Programmatic query construction with projection const query = { status: "active", "profile.age": { $gt: 18 } }; const projection = { _id: 0, name: 1, email: 1, "profile.age": 1 }; db.collection('users').find(query, { projection: projection }).toArray() .then(users => { console.log(users); }) .catch(err => { console.error(err); }); """ ### C. Data Access Objects (DAOs) and Repositories * **Do This:** Implement DAOs or repositories to abstract data access logic from the rest of the application. Define interfaces for DAOs/repositories to promote loose coupling and testability. Use dependency injection (DI) to provide DAOs/repositories to consuming components. Handle connection management (connecting and disconnecting) within the DAOs/repositories. Use MongoDB's built-in connection pooling. * **Don't Do This:** Embed data access logic directly within business logic components. Create tight coupling between business logic and MongoDB driver code. Manually manage database connections in multiple places throughout the application, circumventing the driver's connection pooling. * **Why:** DAOs and repositories provide a layer of abstraction between the application and the database, making it easier to test, maintain, and evolve the system. They centralize data access logic, enforce consistency, and promote code reuse. DI enables loose coupling and simplifies unit testing. """java // Example: DAO interface (Java) public interface UserDAO { User findById(String id); List<User> findByStatus(String status); void save(User user); void delete(String id); } // Example: DAO implementation (Java) public class MongoDBUserDAO implements UserDAO { private final MongoCollection<User> userCollection; public MongoDBUserDAO(MongoClient mongoClient, String databaseName, String collectionName) { MongoDatabase database = mongoClient.getDatabase(databaseName); this.userCollection = database.getCollection(collectionName, User.class); // Assuming you have a User class } @Override public User findById(String id) { return userCollection.find(eq("_id", new ObjectId(id))).first(); } @Override public List<User> findByStatus(String status) { return userCollection.find(eq("status", status)).into(new ArrayList<>()); } @Override public void save(User user) { if (user.getId() == null) { user.setId(new ObjectId()); userCollection.insertOne(user); } else { userCollection.replaceOne(eq("_id", user.getId()), user); } } @Override public void delete(String id) { userCollection.deleteOne(eq("_id", new ObjectId(id))); } } """ ### D. Aggregation Pipelines * **Do This:** Design aggregation pipelines to perform complex data transformations, analytics, and reporting directly within the database. Use indexes to optimize the performance of aggregation pipelines. Understand the different aggregation stages and choose the most appropriate ones for your needs. Construct pipelines modularly and reuse common stages where applicable. Test the correctness and performance of aggregation pipelines. * **Don't Do This:** Perform complex data transformations in the application layer that could be done more efficiently within the database using aggregation pipelines. Neglect using indexes to optimize aggregation pipeline performance. 
Construct overly complex pipelines that are difficult to understand and maintain. * **Why:** Aggregation pipelines provide a powerful and efficient way to process large datasets directly within MongoDB. By performing data transformations within the database, you can reduce network traffic, memory usage, and CPU load on the application server. Modular pipelines are easier to understand, test, and maintain. """javascript // Example: Aggregation pipeline to calculate average age by city db.collection('users').aggregate([ { $match: { status: "active" } }, { $group: { _id: "$profile.city", averageAge: { $avg: "$profile.age" }, userCount: { $sum: 1 } } }, { $sort: { averageAge: -1 } } ]).toArray() .then(results => { console.log(results); }) .catch(err => { console.error(err); }); """ ### E. Data Validation * **Do this:** Implement MongoDB's built in Schema Validation with JSON Schema syntax to ensure data integrity on insert and update operations. Consider using "validationLevel: "moderate"" and "validationAction: "warn"" during development & staging to allow the application to handle validation errors instead of hard failing database operations. * **Don't do this:** Rely solely on application-level validation, to bypass database enforced schema. Set "validationAction" to "error" in production without adequately handling resulting exceptions in the application. * **Why:** Implementing validation at the database level provides a strong defense against malformed data. It improves data consistency, reduces errors, and simplifies application-level validation logic. Using "moderate" validation during development provides flexibility while still catching invalid data issues early. """javascript // Example: Schema Valication db.createCollection( "myCollection", { validator: { $jsonSchema: { bsonType: "object", required: [ "name", "age" ], properties: { name: { bsonType: "string", description: "must be a string and is required" }, age: { bsonType: "int", minimum: 0, description: "must be an integer >= 0 and is required" }, email: { bsonType: "string", pattern: "^.+@.+\\..+$", description: "must be a valid email address" } } } }, validationLevel: 'moderate', //or strict validationAction: 'warn' //or error } ) """ ### F. Error Handling * **Do This:** Implement robust error handling throughout the application. Catch MongoDB-specific exceptions and provide meaningful error messages to the user. Log errors appropriately. Implement retry logic for transient errors, such as network connectivity issues. Implement circuit breaker pattern for database outages. * **Don't Do This:** Ignore exceptions or provide generic error messages that don't help diagnose the problem. Expose sensitive database information in error messages. * **Why:** Proper error handling is crucial for application stability and usability. It helps prevent unexpected crashes, provides informative feedback to the user, and simplifies debugging. Logging errors allows you to monitor the health of the system and identify potential problems. """javascript // Example: Error handling with async/await async function getUser(userId) { try { const user = await db.collection('users').findOne({ _id: userId }); if (!user) { throw new Error("User with ID ${userId} not found"); } return user; } catch (err) { console.error("Error retrieving user with ID ${userId}:", err); // Consider logging to a central error logging service throw new Error("Failed to retrieve user. Please try again later."); // Mask the underlying exception. } } """ ## III. 
## III. Further Considerations

* **Security:** Implement security measures to protect sensitive data. Use authentication and authorization to control access to the database. Use encryption to protect data at rest and in transit. Follow the principle of least privilege. Sanitize user inputs to prevent injection vulnerabilities (e.g., NoSQL injection). Avoid storing sensitive information such as plaintext passwords or raw payment data; hash or encrypt such values instead.
* **Performance Monitoring:** Implement performance monitoring to track database performance and identify potential bottlenecks. Use MongoDB's built-in monitoring tools or external monitoring services. Monitor query performance, index usage, and resource utilization. Use "explain()" to analyze slow queries (see the sketch at the end of this section).
* **Logging:** Implement comprehensive logging to track application activity and diagnose problems. Log relevant events, such as user logins, data modifications, and errors. Use a structured logging format (e.g., JSON) to simplify analysis. Ensure logs are rotated and archived appropriately.
* **Testing:** Implement thorough testing to ensure the correctness and reliability of the application. Write unit tests to verify the behavior of individual components, integration tests to verify the interaction between components, and end-to-end tests to verify the overall functionality of the system. Use mocking to isolate components during testing. Use test data that is representative of production data.

By adhering to these component design standards, development teams can create robust, maintainable, and performant MongoDB applications. Remember to stay up to date with the latest MongoDB features and best practices by consulting the official MongoDB documentation.
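As referenced in the performance monitoring bullet above, here is a minimal sketch of analyzing a slow query with "explain()" in the Node.js driver; the database name, collection, and filter are placeholder values.

"""javascript
// Sketch: inspect a query plan with explain() (database, collection, and filter are placeholders).
const { MongoClient } = require('mongodb');

async function analyzeSlowQuery(uri) {
  const client = new MongoClient(uri);
  try {
    await client.connect();
    const users = client.db('mydatabase').collection('users');

    // 'executionStats' reports documents examined, keys examined, and execution time.
    const plan = await users.find({ status: 'active' }).explain('executionStats');
    console.log('Docs examined:', plan.executionStats.totalDocsExamined);
    console.log('Execution time (ms):', plan.executionStats.executionTimeMillis);
  } finally {
    await client.close();
  }
}
"""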
# Testing Methodologies Standards for MongoDB

This document outlines the testing methodologies standards for MongoDB projects. It provides guidance for developers to write robust and maintainable tests covering unit, integration, and end-to-end aspects of MongoDB interactions. These standards apply to the latest versions of MongoDB and aim to ensure code quality, reliability, and performance.

## 1. General Testing Principles

### 1.1. Test Pyramid

* **Do This:** Adhere to the Test Pyramid principle: many unit tests, fewer integration tests, and even fewer end-to-end tests.
* **Don't Do This:** Neglect unit tests in favor of complex end-to-end tests, leading to slow feedback loops and difficult debugging.
* **Why:** Unit tests provide fast feedback and isolate problems effectively. Integration and end-to-end tests verify the interaction between components or systems, providing confidence in the overall functionality. Over-reliance on end-to-end tests can result in slow and brittle test suites.

### 1.2. Test-Driven Development (TDD)

* **Do This:** Consider practicing TDD, writing tests before implementing the functionality.
* **Don't Do This:** Defer writing tests until after the feature is complete, risking inadequate test coverage and introducing bugs.
* **Why:** TDD helps clarify requirements, promotes a modular design, and ensures comprehensive test coverage from the start.

### 1.3. Independent and Repeatable Tests

* **Do This:** Ensure tests are independent and repeatable. Each test should set up its own data and tear it down afterwards to avoid interference with other tests.
* **Don't Do This:** Rely on shared state or data between tests, which can lead to flaky and unpredictable behavior.
* **Why:** Independent tests improve reliability. Repeatable tests run consistently across different environments and machines.

## 2. Unit Testing MongoDB Interactions

### 2.1. Mocking the MongoDB Driver

* **Do This:** Mock the MongoDB driver's methods (e.g., "insertOne", "find", "updateOne") to isolate the unit under test. Avoid directly interacting with a real MongoDB instance in unit tests.
* **Don't Do This:** Directly connect to a MongoDB instance in unit tests. This makes tests slow, dependent on database availability, and difficult to reason about.
* **Why:** Mocking allows you to test the logic surrounding MongoDB interactions without the overhead or dependencies of a real database. It enables you to verify that the correct arguments are passed to the driver methods and to exercise different return values and error conditions.
**Example (Using Jest and the "mongodb" Node.js Driver):**

"""javascript
// user.service.js
const { MongoClient } = require('mongodb');

async function createUser(dbName, collectionName, userData) {
  const uri = 'mongodb://localhost:27017'; // Replace with your connection string (e.g., an Atlas URI)
  const client = new MongoClient(uri);

  try {
    await client.connect();
    const db = client.db(dbName);
    const collection = db.collection(collectionName);
    const result = await collection.insertOne(userData);
    return result.insertedId;
  } finally {
    await client.close();
  }
}

module.exports = { createUser };
"""

"""javascript
// user.service.test.js
const { createUser } = require('./user.service');
const { MongoClient } = require('mongodb');

jest.mock('mongodb'); // Mock the mongodb module

describe('createUser', () => {
  beforeEach(() => {
    jest.clearAllMocks(); // Reset call counts so each test starts from a clean mock
  });

  it('should insert a user and return the insertedId', async () => {
    const mockInsertedId = 'mockedInsertedId';

    // Mock the MongoClient and its methods
    const mockInsertOneResult = { insertedId: mockInsertedId };
    const mockCollection = { insertOne: jest.fn().mockResolvedValue(mockInsertOneResult) };
    const mockDb = { collection: jest.fn().mockReturnValue(mockCollection) };
    const mockClient = {
      connect: jest.fn().mockResolvedValue(),
      db: jest.fn().mockReturnValue(mockDb),
      close: jest.fn().mockResolvedValue()
    };

    MongoClient.mockImplementation(() => mockClient); // mock implementation

    const userData = { name: 'John Doe', email: 'john.doe@example.com' };
    const insertedId = await createUser('testdb', 'users', userData);

    expect(MongoClient).toHaveBeenCalledTimes(1);
    expect(mockClient.connect).toHaveBeenCalledTimes(1);
    expect(mockClient.db).toHaveBeenCalledWith('testdb');
    expect(mockDb.collection).toHaveBeenCalledWith('users');
    expect(mockCollection.insertOne).toHaveBeenCalledWith(userData);
    expect(insertedId).toBe(mockInsertedId);
    expect(mockClient.close).toHaveBeenCalledTimes(1);
  });

  it('should handle connection or insertion errors and close the connection', async () => {
    const mockError = new Error('Connection failed');
    const mockClient = {
      connect: jest.fn().mockRejectedValue(mockError),
      db: jest.fn(),
      close: jest.fn().mockResolvedValue()
    };

    MongoClient.mockImplementation(() => mockClient);

    const userData = { name: 'John Doe', email: 'john.doe@example.com' };

    await expect(createUser('testdb', 'users', userData)).rejects.toThrow(mockError);

    expect(MongoClient).toHaveBeenCalledTimes(1);
    expect(mockClient.connect).toHaveBeenCalledTimes(1);
    expect(mockClient.close).toHaveBeenCalledTimes(1); // Ensure close is called even on error
  });
});
"""

### 2.2. Verifying Interaction with the Driver

* **Do This:** Assert that the correct methods on the MongoDB driver are called with the expected arguments. Verify the data passed to the driver and the expected return values.
* **Don't Do This:** Only focus on the output of the unit under test, neglecting to verify the underlying MongoDB interaction.
* **Why:** Verifying the driver interactions ensures the code correctly translates business logic into MongoDB operations.

### 2.3. Testing Error Handling

* **Do This:** Mock the MongoDB driver to simulate different error scenarios (e.g., connection errors, duplicate key errors, validation errors) and assert that the code handles them appropriately.
* **Don't Do This:** Ignore error handling scenarios in unit tests, potentially leaving the application vulnerable to unexpected failures.
* **Why:** Robust error handling ensures the application remains stable and provides informative error messages to the user.
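As a concrete illustration of simulating error scenarios, the following sketch mocks a duplicate key error (MongoDB error code 11000); "registerUser" is a hypothetical wrapper introduced only for this example, and the error shape is an assumption about how such a test might be structured.

"""javascript
// user.duplicate.test.js -- sketch: simulating a duplicate key error in a unit test.
// registerUser is a hypothetical wrapper that receives an already-connected collection.
async function registerUser(collection, userData) {
  try {
    const result = await collection.insertOne(userData);
    return { ok: true, id: result.insertedId };
  } catch (err) {
    if (err.code === 11000) {
      return { ok: false, reason: 'email already registered' };
    }
    throw err;
  }
}

describe('registerUser', () => {
  it('maps duplicate key errors to a friendly result', async () => {
    const duplicateKeyError = Object.assign(new Error('E11000 duplicate key error'), { code: 11000 });
    const mockCollection = { insertOne: jest.fn().mockRejectedValue(duplicateKeyError) };

    const result = await registerUser(mockCollection, { email: 'john.doe@example.com' });

    expect(mockCollection.insertOne).toHaveBeenCalledWith({ email: 'john.doe@example.com' });
    expect(result).toEqual({ ok: false, reason: 'email already registered' });
  });

  it('re-throws unexpected errors', async () => {
    const networkError = new Error('network down');
    const mockCollection = { insertOne: jest.fn().mockRejectedValue(networkError) };

    await expect(registerUser(mockCollection, { email: 'x@example.com' })).rejects.toThrow('network down');
  });
});
"""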
## 3. Integration Testing MongoDB Interactions

### 3.1. Using a Test Database

* **Do This:** Utilize a dedicated test database for integration tests. This prevents accidental data corruption in the production database. Configure your test environment to point to this database.
* **Don't Do This:** Run integration tests against the production database. This is extremely risky and can lead to data loss or corruption.
* **Why:** A test database provides a safe and isolated environment for integration tests.

### 3.2. Setting Up and Tearing Down Data

* **Do This:** For each integration test, set up the necessary data in the test database *before* running the test. *After* the test completes, tear down the data (e.g., clear out collections) to ensure a clean state for subsequent tests.
* **Don't Do This:** Leave data in the test database after a test run. This can lead to inconsistent and unpredictable test results.
* **Why:** Proper setup and teardown ensures that each integration test runs in a known and consistent state.

**Example (Using Jest and the "mongodb" Node.js Driver):**

"""javascript
// product.service.integration.test.js
const { MongoClient } = require('mongodb');
const { getProductById, createProduct } = require('./product.service'); // Replace with your actual path

describe('Product Service Integration Tests', () => {
  let client;
  let db;
  const dbName = 'testdb'; // Dedicated test database
  const collectionName = 'products';

  beforeAll(async () => {
    const uri = 'mongodb://localhost:27017'; // Replace with the connection string for your LOCAL MongoDB. Not Atlas.
    client = new MongoClient(uri);
    await client.connect();
    db = client.db(dbName);
  });

  afterAll(async () => {
    if (client) {
      await client.close();
    }
  });

  beforeEach(async () => {
    // Clean the collection before each test
    await db.collection(collectionName).deleteMany({});
  });

  it('should create a product and retrieve it by ID', async () => {
    const productData = {
      name: 'Test Product',
      price: 20.00,
      description: 'A test product for integration testing',
    };

    const insertedId = await createProduct(dbName, collectionName, productData);
    expect(insertedId).toBeDefined();

    const retrievedProduct = await getProductById(dbName, collectionName, insertedId);
    expect(retrievedProduct).toEqual({ _id: insertedId, ...productData });
  });

  it('should return null if a product with the given ID does not exist', async () => {
    const nonExistingProductId = '65aa21abcdef098765432100'; // a valid 24-character ObjectId hex string that does not exist
    const retrievedProduct = await getProductById(dbName, collectionName, nonExistingProductId);
    expect(retrievedProduct).toBeNull();
  });
});
"""

### 3.3. Testing Data Consistency

* **Do This:** Write integration tests to verify data consistency across multiple operations. For example, test that updating a document in one collection correctly updates related documents in other collections, as shown in the sketch below.
* **Don't Do This:** Only test individual operations in isolation, neglecting to verify data consistency across the application.
* **Why:** Data consistency is crucial for maintaining the integrity of your application. Integration tests can identify potential consistency issues that might not be apparent in unit tests.
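A minimal sketch of such a consistency check, assuming the same connection setup as the example above and an illustrative "renameUser" helper that keeps a denormalized author name in sync:

"""javascript
// consistency.integration.test.js -- sketch of a cross-collection consistency check.
// renameUser is an illustrative helper; in a real project this would live in application code.
async function renameUser(db, userId, newName) {
  await db.collection('users').updateOne({ _id: userId }, { $set: { name: newName } });
  await db.collection('posts').updateMany({ authorId: userId }, { $set: { authorName: newName } });
}

it('keeps the denormalized author name in sync across collections', async () => {
  // Seed a user and two posts that carry a copy of the user's name.
  await db.collection('users').insertOne({ _id: 'user1', name: 'Old Name' });
  await db.collection('posts').insertMany([
    { _id: 'post1', authorId: 'user1', authorName: 'Old Name' },
    { _id: 'post2', authorId: 'user1', authorName: 'Old Name' }
  ]);

  await renameUser(db, 'user1', 'New Name');

  const posts = await db.collection('posts').find({ authorId: 'user1' }).toArray();
  expect(posts.every(post => post.authorName === 'New Name')).toBe(true);
});
"""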
### 3.4. Using Transactions (If Applicable)

* **Do This:** If your application uses MongoDB transactions, write integration tests to verify that transactions are executed correctly and that data is rolled back in case of errors.
* **Don't Do This:** Assume that transactions always work correctly without explicit testing.
* **Why:** Transactions guarantee atomicity, consistency, isolation, and durability (ACID) properties. Testing them rigorously is essential.

**Example (Testing Transactions):** (The example assumes a simple transfer of funds between two user accounts. Note that transactions require a replica set or sharded cluster, such as a single-node replica set for local testing.)

"""javascript
// transactions.integration.test.js
const { MongoClient } = require('mongodb');

describe('Transaction Integration Tests', () => {
  let client;
  let db;
  let session;
  const dbName = 'testdb';
  const accountsCollectionName = 'accounts';

  beforeAll(async () => {
    const uri = 'mongodb://localhost:27017'; // Replace with your MongoDB URI (must point to a replica set)
    client = new MongoClient(uri);
    await client.connect();
    db = client.db(dbName);
  });

  afterAll(async () => {
    if (client) {
      await client.close();
    }
  });

  beforeEach(async () => {
    // Clean the collection before each test
    await db.collection(accountsCollectionName).deleteMany({});
    session = client.startSession(); // Start a new session for each test
  });

  afterEach(async () => {
    await session.endSession();
  });

  it('should successfully transfer funds between two accounts using a transaction', async () => {
    const accountsCollection = db.collection(accountsCollectionName);

    // Initialize two accounts
    await accountsCollection.insertMany([
      { _id: 'account1', balance: 100 },
      { _id: 'account2', balance: 0 }
    ]);

    const transferAmount = 30;

    const transferFunds = async () => {
      try {
        session.startTransaction();

        // Debit from account1
        await accountsCollection.updateOne(
          { _id: 'account1' },
          { $inc: { balance: -transferAmount } },
          { session }
        );

        // Credit to account2
        await accountsCollection.updateOne(
          { _id: 'account2' },
          { $inc: { balance: transferAmount } },
          { session }
        );

        await session.commitTransaction();
      } catch (error) {
        await session.abortTransaction();
        throw error;
      }
    };

    await transferFunds();

    // Verify the balances after the transaction
    const account1 = await accountsCollection.findOne({ _id: 'account1' });
    const account2 = await accountsCollection.findOne({ _id: 'account2' });

    expect(account1.balance).toBe(100 - transferAmount);
    expect(account2.balance).toBe(transferAmount);
  });

  it('should rollback the transaction if an error occurs during the transfer', async () => {
    const accountsCollection = db.collection(accountsCollectionName);

    // Initialize two accounts
    await accountsCollection.insertMany([
      { _id: 'account1', balance: 100 },
      { _id: 'account2', balance: 0 }
    ]);

    const transferAmount = 30;

    // Force a failure by refusing to debit more than the available balance
    const transferFundsWithInsufficientBalance = async () => {
      try {
        session.startTransaction();

        const debitResult = await accountsCollection.updateOne(
          { _id: 'account1', balance: { $gte: 150 } }, // Only debit if at least 150 is available
          { $inc: { balance: -150 } },
          { session }
        );
        if (debitResult.matchedCount === 0) {
          throw new Error('Insufficient funds'); // Triggers the abort below
        }

        await accountsCollection.updateOne(
          { _id: 'account2' },
          { $inc: { balance: transferAmount } },
          { session }
        );

        await session.commitTransaction();
      } catch (error) {
        await session.abortTransaction();
        throw error;
      }
    };

    await expect(transferFundsWithInsufficientBalance()).rejects.toThrow('Insufficient funds');

    // Verify the balances after the attempted transaction (should be unchanged)
    const account1 = await accountsCollection.findOne({ _id: 'account1' });
    const account2 = await accountsCollection.findOne({ _id: 'account2' });

    expect(account1.balance).toBe(100);
    expect(account2.balance).toBe(0);
  });
});
"""

### 3.5. Monitoring Using WiredTiger Metrics

* **Do This:** Monitor key WiredTiger metrics during integration tests to identify potential performance bottlenecks. Pay attention to cache usage, page faults, and other performance indicators.
* **Don't Do This:** Ignore WiredTiger metrics, missing opportunities to optimize database performance.
* **Why:** WiredTiger is MongoDB's storage engine. Monitoring its behavior provides valuable insights into database performance and resource utilization, particularly during integration scenarios.
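A minimal sketch of sampling WiredTiger cache metrics during an integration run via the "serverStatus" command; the fields read and the assertion are illustrative, the "client" handle is assumed to come from the integration setup above, and the command assumes sufficient privileges on the test instance.

"""javascript
// wiredtiger.metrics.test.js -- sketch: sampling WiredTiger cache metrics via serverStatus.
// Assumes the same `client` connection as the integration setup above.
it('reports WiredTiger cache usage after the test workload', async () => {
  const status = await client.db('admin').command({ serverStatus: 1 });
  const cache = status.wiredTiger.cache;

  const bytesInCache = cache['bytes currently in the cache'];
  const maxBytes = cache['maximum bytes configured'];

  console.log(`WiredTiger cache: ${bytesInCache} of ${maxBytes} bytes in use`);

  // Illustrative guard: fail loudly if the workload pushes the cache to its limit.
  expect(bytesInCache).toBeLessThan(maxBytes);
});
"""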
## 4. End-to-End Testing MongoDB Interactions

### 4.1. Testing the Complete Application Flow

* **Do This:** Write end-to-end tests to verify the complete application flow, including the user interface, application logic, and MongoDB interactions. Simulate real user actions.
* **Don't Do This:** Focus on testing individual components in isolation, neglecting to verify the overall application behavior.
* **Why:** End-to-end tests provide the highest level of confidence in the application's functionality.

### 4.2. Using a Realistic Test Environment

* **Do This:** Configure a realistic test environment for end-to-end tests, including a MongoDB instance that closely resembles the production environment. This could include sharding and replication configuration. Ideally, use a staging environment for E2E tests.
* **Don't Do This:** Run end-to-end tests against a simplified or unrealistic test environment. This can mask potential issues that only appear in production.
* **Why:** A realistic test environment ensures that end-to-end tests accurately reflect the application's behavior in production.

### 4.3. Data Setup and Teardown for E2E Tests

* **Do This:** Implement more sophisticated data setup and teardown strategies for E2E tests than for integration tests. This may include seeding data through the application's API and running automated clean-up scripts.
* **Don't Do This:** Manually manipulate data for E2E tests, which makes them prone to errors and difficult to maintain.
* **Why:** Automated and robust data management significantly increases the reliability and repeatability of E2E test suites.

### 4.4. Monitoring Real-Time Performance

* **Do This:** Integrate performance monitoring tools into the end-to-end testing framework to measure response times and identify performance bottlenecks.
* **Don't Do This:** Neglect to monitor performance during end-to-end tests, missing opportunities to optimize the application's performance under realistic load.
* **Why:** This provides comprehensive data on the application's end-to-end performance, considering all layers of the application stack.

## 5. Testing Tools and Frameworks

### 5.1. Choosing the Right Tools

* **Do This:** Carefully select testing tools and frameworks that are appropriate for the language/environment of your MongoDB application. Options include Jest, Mocha, Chai, Supertest (Node.js), Pytest (Python), etc.
* **Don't Do This:** Pick tools arbitrarily or without considering the specific testing needs of your MongoDB projects.
* **Why:** The right tools make testing more efficient, readable, and maintainable, improving the overall quality of your code.

### 5.2. Using MongoDB-Memory-Server

* **Do This:** Consider using "mongodb-memory-server" for integration tests. It simplifies the setup and teardown of MongoDB instances for testing purposes.
* **Don't Do This:** Manually download and configure MongoDB instances for each integration test.
* **Why:** It provides an embedded in-memory MongoDB database for integration testing. It can speed up the execution of tests and eliminate external dependencies.
**Example:**

"""javascript
const { MongoMemoryServer } = require('mongodb-memory-server');
const { MongoClient } = require('mongodb');

describe('Using MongoDB Memory Server for integration tests', () => {
  let mongoServer, client, db;

  beforeAll(async () => {
    mongoServer = await MongoMemoryServer.create();
    const uri = mongoServer.getUri(); // Obtain the automatically generated URI
    client = new MongoClient(uri);
    await client.connect();
    db = client.db('testdb'); // use 'testdb' or another name that's relevant
  });

  afterAll(async () => {
    await client.close();
    await mongoServer.stop();
  });

  it('should insert a document into the collection', async () => {
    const collection = db.collection('testcollection');
    const result = await collection.insertOne({ name: 'test', value: 123 });
    expect(result.insertedId).toBeDefined();
  });
});
"""

### 5.3. Integration with CI/CD Pipelines

* **Do This:** Integrate your tests into CI/CD pipelines to ensure automated execution every time code changes. Use tools like Jenkins, CircleCI, or GitHub Actions for automated test execution and reporting.
* **Don't Do This:** Rely on manual execution of tests, which can lead to missed bugs and inconsistencies between environments.
* **Why:** Automated testing increases code quality, reduces the risk of regressions, and enables faster delivery cycles.

## 6. Testing Aggregation Pipelines

### 6.1. Thorough Validation

* **Do This:** When testing aggregation pipelines, even simple ones, validate the output at *each stage* where possible (see the stage-by-stage sketch at the end of this section).
* **Don't Do This:** Assume that the pipeline works just because the final result *looks* correct, without verifying intermediate transformations.
* **Why:** This simplifies debugging by pinpointing exactly where any unexpected transformation happens.

### 6.2. Testing Edge Cases in Aggregations

* **Do This:** Create test cases that cover conditions like empty collections, null or missing fields, unusual data types, extremely large datasets, boundary values, etc.
* **Don't Do This:** Only test with "happy path" data, failing to check how the pipeline behaves under less common situations.
* **Why:** These conditions can introduce non-obvious bugs that are easily missed by superficial testing.

### 6.3. Performance Testing Aggregations

* **Do This:** Measure the execution time of complex or performance-critical aggregation pipelines, particularly with representative data volumes. Identify slow stages that can be optimized (e.g., using indexes).
* **Don't Do This:** Assume aggregations are fast enough without actual performance testing.
* **Why:** Some aggregation operations can scale very poorly, dominating database resources and significantly impacting performance.
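As referenced in section 6.1, the following is a minimal sketch of stage-by-stage validation, reusing the average-age pipeline from the component design section; the seed data and expected values are illustrative, and a "db" handle from an integration setup (test database or "mongodb-memory-server") is assumed.

"""javascript
// aggregation.pipeline.test.js -- sketch: validating an aggregation pipeline stage by stage.
// Assumes a `db` handle from the integration setup above.
const pipeline = [
  { $match: { status: 'active' } },
  { $group: { _id: '$profile.city', averageAge: { $avg: '$profile.age' }, userCount: { $sum: 1 } } },
  { $sort: { averageAge: -1 } }
];

it('produces the expected result at each stage', async () => {
  const users = db.collection('users');
  await users.insertMany([
    { status: 'active', profile: { city: 'Berlin', age: 30 } },
    { status: 'active', profile: { city: 'Berlin', age: 40 } },
    { status: 'inactive', profile: { city: 'Berlin', age: 99 } }
  ]);

  // Stage 1: $match should drop the inactive user.
  const afterMatch = await users.aggregate(pipeline.slice(0, 1)).toArray();
  expect(afterMatch).toHaveLength(2);

  // Stages 1-2: $group should average only the active users' ages.
  const afterGroup = await users.aggregate(pipeline.slice(0, 2)).toArray();
  expect(afterGroup).toEqual([{ _id: 'Berlin', averageAge: 35, userCount: 2 }]);

  // Full pipeline: $sort over a single group is trivial but completes the check.
  const finalResult = await users.aggregate(pipeline).toArray();
  expect(finalResult).toHaveLength(1);
});
"""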