# Tooling and Ecosystem Standards for LLVM
This document outlines the recommended tooling and ecosystem standards for LLVM development. Adhering to these standards ensures a consistent, maintainable, and performant codebase across the LLVM project. These guidelines aim to improve developer productivity, code quality, and collaboration within the LLVM community.
## 1. Development Environment and Build System
### 1.1. Recommended Development Environment
* **Do This:** Use a standardized development environment to ensure consistency across platforms and teams. Consider using Docker containers or similar virtualization technologies to encapsulate dependencies.
* **Don't Do This:** Rely on ad-hoc setups that are difficult to reproduce and may lead to environment-specific bugs.
**Why:** Consistency reduces "works on my machine" issues and simplifies collaboration.
**Example:**
"""dockerfile
# Dockerfile for LLVM development
# Pin the base image so rebuilds are reproducible (24.04 is an example tag)
FROM ubuntu:24.04
# Install dependencies
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
build-essential \
cmake \
git \
python3 \
ninja-build \
clang \
lld \
libtinfo-dev
# Set up LLVM source directory
WORKDIR /llvm
# Clone LLVM (replace with specific branch/version if needed)
RUN git clone https://github.com/llvm/llvm-project.git
# Set environment variables
ENV LLVM_SRC=/llvm/llvm-project
ENV PATH="$PATH:/llvm/llvm-project/build/bin"
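# Optional: install prebuilt clang tools from the distribution
# RUN apt-get install -y clang-format clang-tidy
# (clang-tools-extra is an LLVM subproject, not an apt package; to build it,
# add it to LLVM_ENABLE_PROJECTS in the cmake command below)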
# Build LLVM (example)
WORKDIR /llvm/llvm-project/build
RUN cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lld" -DCMAKE_BUILD_TYPE=Release ../llvm
RUN ninja
"""
### 1.2. Build System - CMake
* **Do This:** Utilize the CMake build system provided by LLVM. Follow the standard CMake practices for defining targets, dependencies, and build options.
* **Don't Do This:** Resort to custom build scripts or Makefile hacks that bypass the CMake infrastructure.
**Why:** CMake provides a portable, efficient, and well-supported build system that integrates seamlessly with LLVM's tooling.
**Example:**
"""cmake
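# CMakeLists.txt for a new out-of-tree LLVM tool
cmake_minimum_required(VERSION 3.20) # Recent LLVM releases require CMake 3.20 or newer
project(MyNewTool LANGUAGES CXX)
# Find LLVM; pass -DLLVM_DIR=/path/to/llvm/build/lib/cmake/llvm when configuring
find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
# Make LLVM headers and compile definitions visible to the tool
include_directories(${LLVM_INCLUDE_DIRS})
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})
# Add source files
add_executable(MyNewTool MyNewTool.cpp)
# Map component names to library names, then link against them
llvm_map_components_to_libnames(LLVM_LIBS support core) # Add more components as needed.
target_link_libraries(MyNewTool PRIVATE ${LLVM_LIBS})
# Install the tool
install(TARGETS MyNewTool DESTINATION bin)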
"""
### 1.3. Build Types
* **Do This:** Use the standard CMake build types: "Debug", "Release", "RelWithDebInfo", and "MinSizeRel". Use "Debug" builds for development and debugging. Use "Release", "RelWithDebInfo", or "MinSizeRel" when profiling or deploying.
* **Don't Do This:** Create custom, non-standard build types without a clear justification, or mix debug and release flags manually.
**Why:** Standard build types are optimized for specific scenarios. Debug builds provide symbolic information for debugging, and release builds are optimized for performance.
### 1.4. Ninja Build System
* **Do This:** Use Ninja as a CMake generator ("cmake -G Ninja ...").
* **Don't Do This:** Rely solely on Makefiles (unless there's a specific reason).
**Why:** Ninja generally provides faster build times compared to Makefiles, especially for large projects like LLVM.
### 1.5. Ccache or similar caching tools
* **Do This:** Use "ccache" or "sccache" to significantly reduce compilation times, especially in CI environments.
* **Don't Do This:** Neglect to configure or utilize these tools when doing frequent builds.
**Why:** These tools improve iterative development speed by caching and reusing compilation results.
"""bash
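# Example integration (assuming ccache is installed)
export CCACHE_DIR=/path/to/ccache
export CCACHE_MAXSIZE=10G
# Wrap both the C and C++ compilers; LLVM contains sources in both languages
cmake -G Ninja -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache ...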
"""
## 2. LLVM Libraries and Tools
### 2.1. Utilizing LLVM Support Libraries
* **Do This:** Leverage LLVM's comprehensive support libraries for common tasks like string manipulation ("llvm::StringRef", "llvm::Twine"), data structures ("llvm::SmallVector", "llvm::DenseMap"), and file system access ("llvm::sys::fs").
* **Don't Do This:** Re-implement functionality already provided by LLVM's support libraries. Avoid using "std::string" where "llvm::StringRef" is more appropriate (read-only string access).
**Why:** LLVM support libraries are highly optimized and integrated within the LLVM ecosystem. They also promote code reuse and consistency.
**Example:**
"""c++
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"
// StringRef is a cheap, non-owning view, so it is passed by value.
void printMessage(llvm::StringRef Message) {
  llvm::outs() << "Message: " << Message << "\n";
}
int main() {
  const char *Text = "Hello, LLVM!";
  llvm::StringRef Message(Text);
  printMessage(Message);
  return 0;
}
"""
### 2.2. Using LLVM's Diagnostics Infrastructure
* **Do This:** Employ LLVM's diagnostic reporting mechanism ("llvm::DiagnosticInfo", "llvm::DiagnosticPrinter", "llvm::DiagnosticHandler") to issue errors, warnings, and remarks. This is especially true when developing compiler passes.
* **Don't Do This:** Use raw "fprintf" or "std::cerr" for diagnostic output, as this bypasses LLVM's structured error handling.
**Why:** LLVM's diagnostic system provides a unified way to report diagnostic information, enabling better integration with IDEs and tools.
**Example:**
"""c++
#include "llvm/IR/DiagnosticHandler.h"
#include "llvm/IR/DiagnosticInfo.h"
#include "llvm/IR/DiagnosticPrinter.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/Support/raw_ostream.h"
#include <cstdlib>
#include <memory>
#include <string>
// A tool-specific diagnostic. Its kind ID is allocated once from the plugin range.
class MyDiagnosticInfo : public llvm::DiagnosticInfo {
public:
  MyDiagnosticInfo(llvm::StringRef Message, llvm::DiagnosticSeverity Severity)
      : llvm::DiagnosticInfo(getKindID(), Severity), Message(Message.str()) {}
  void print(llvm::DiagnosticPrinter &DP) const override {
    DP << "MyCustomTool: " << Message;
  }
private:
  static int getKindID() {
    static const int Kind = llvm::getNextAvailablePluginDiagnosticKind();
    return Kind;
  }
  std::string Message; // Owned copy, so the diagnostic cannot dangle.
};
// Prints every diagnostic and stops the tool on the first error.
struct MyDiagnosticHandler : public llvm::DiagnosticHandler {
  bool handleDiagnostics(const llvm::DiagnosticInfo &DI) override {
    llvm::DiagnosticPrinterRawOStream DP(llvm::errs());
    DI.print(DP);
    llvm::errs() << '\n';
    if (DI.getSeverity() == llvm::DS_Error)
      std::exit(1);
    return true; // Handled; skip the default handler.
  }
};
void emitDiagnostic(llvm::LLVMContext &Context, llvm::StringRef Message, llvm::DiagnosticSeverity Severity) {
  Context.diagnose(MyDiagnosticInfo(Message, Severity));
}
int main() {
  llvm::LLVMContext Context;
  Context.setDiagnosticHandler(std::make_unique<MyDiagnosticHandler>());
  emitDiagnostic(Context, "This is a warning!", llvm::DS_Warning);
  emitDiagnostic(Context, "This is an error!", llvm::DS_Error);
  return 0;
}
"""
### 2.3. TableGen
* **Do This:** Use TableGen to describe declarative information such as instruction sets, register definitions, and code patterns. Define data in ".td" files and generate C++ code using TableGen tools.
* **Don't Do This:** Hardcode these data directly in C++. Avoid manual modifications of generated code.
**Why:** TableGen allows you to describe data in a structured, declarative way, which makes it easier to maintain and extend.
**Example:**
"""tablegen
// Example .td file
class MyInstruction<string opCodeStr, list<dag> pattern> : Instruction {
  string OpCodeStr = opCodeStr;
  list<dag> Pattern = pattern;
  let Namespace = "MY";
}
def MyADD : MyInstruction<"add", [(add i32:$src1, i32:$src2)]>;
"""
Then, use the appropriate TableGen backend to generate C++ code based on this definition.
### 2.4. Versioning and Compatibility
* **Do This:** Follow LLVM's versioning scheme meticulously. Use compatibility macros and conditional compilation to ensure that your code can be compiled with older or newer versions of LLVM. Consult the LLVM release notes for API changes and deprecations.
* **Don't Do This:** Assume that the LLVM API will remain stable across releases.
**Why:** LLVM evolves rapidly, and maintaining compatibility is crucial for long-term project health.
**Example:**
"""c++
#include "llvm/Config/llvm-config.h" // Defines LLVM_VERSION_MAJOR
#include "llvm/Support/raw_ostream.h"
#if LLVM_VERSION_MAJOR >= 16 // Example version check
// #include "llvm/NewHeader.h" // Placeholder: substitute the header your code needs
#define NEW_API_AVAILABLE 1
#else
#define NEW_API_AVAILABLE 0
#endif
void myFunction() {
#if NEW_API_AVAILABLE
  llvm::outs() << "Using the new API.\n";
#else
  llvm::outs() << "Using the old API.\n";
#endif
}
"""
## 3. Coding Practices and Conventions
### 3.1. Code Formatting and Style
* **Do This:** Adhere strictly to the LLVM coding style guidelines. Use "clang-format" to automatically format your code. Configure your editor to run "clang-format" on save.
* **Don't Do This:** Deviate from the established coding style.
**Why:** Consistent code formatting improves readability and reduces merge conflicts.
**Example:**
To format your code, use the following command:
"""bash
clang-format -i MyFile.cpp
"""
The LLVM coding style guide can be found at: [https://llvm.org/docs/CodingStandards.html](https://llvm.org/docs/CodingStandards.html)
### 3.2. Code Reviews
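* **Do This:** Submit your code for review as a GitHub pull request against "llvm/llvm-project". LLVM moved its code review from Phabricator to GitHub pull requests in 2023, and the old Phabricator instance is kept only as a read-only archive. Provide clear explanations of your changes and address reviewer feedback promptly and thoroughly.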
* **Don't Do This:** Bypass the code review process.
**Why:** Code reviews help catch errors, improve code quality, and disseminate knowledge.
### 3.3. Testing
* **Do This:** Write comprehensive unit tests and integration tests for your code. Use LLVM's lit testing framework. Add new tests for bug fixes and new features.
* **Don't Do This:** Neglect testing or submit code with inadequate test coverage.
**Why:** Thorough testing is essential to ensure the correctness and stability of LLVM.
**Example:**
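Create a "test/MyTest.ll" file:
"""llvm
; RUN: lli %s | FileCheck %s
; CHECK: Hello, LLVM!
@.str = private unnamed_addr constant [13 x i8] c"Hello, LLVM!\00"
declare i32 @puts(ptr)
define i32 @main() {
  %r = call i32 @puts(ptr @.str)
  ret i32 0
}
"""
The "RUN" line tells lit how to execute the test: here "lli" runs the IR (this needs JIT or interpreter support on the host) and FileCheck compares its output against the "CHECK" lines. A test that pipes a file into FileCheck against itself proves nothing, because the "CHECK" text trivially matches its own comment. Run "llvm-lit" on the test directory from the build directory, or build the relevant "check-*" target, to execute the test.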
### 3.4. Documentation
* **Do This:** Document your code clearly and concisely using Doxygen-style comments. Provide high-level documentation for public APIs and data structures. Update documentation when you change the code.
* **Don't Do This:** Neglect documentation or write ambiguous or outdated documentation.
**Why:** Good documentation makes it easier for others (and your future self) to understand and maintain your code.
**Example:**
"""c++
/// Calculates the sum of two integers.
///
/// \param A The first integer.
/// \param B The second integer.
/// \returns The sum of A and B.
int add(int A, int B) {
  return A + B;
}
"""
### 3.5. Memory Management
* **Do This:** Use smart pointers ("std::unique_ptr", "std::shared_ptr") or LLVM's "BumpPtrAllocator" for memory management. Adhere to RAII principles. When using "BumpPtrAllocator", allocate memory in arenas and avoid manual "delete" calls.
* **Don't Do This:** Use raw pointers and manual "new"/"delete" without a clear understanding of ownership semantics.
**Why:** Proper memory management prevents memory leaks and dangling pointers.
**Example:**
"""c++
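#include "llvm/Support/Allocator.h"
void processData() {
  llvm::BumpPtrAllocator Allocator;
  // Allocate uninitialized storage for 10 ints from the arena.
  int *Data = Allocator.Allocate<int>(10);
  for (int I = 0; I < 10; ++I)
    Data[I] = I * 2;
  // No delete and no smart pointer: the arena owns the memory and releases all
  // of it when Allocator goes out of scope. Wrapping arena memory in
  // std::unique_ptr would call delete on it, which is undefined behavior.
  // Destructors are not run, so keep to trivially destructible types.
}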
"""
### 3.6. Exception Safety
* **Do This:** Release resources through RAII and smart pointers so that every exit path cleans up what it acquired. Keep in mind that LLVM's own libraries are built without exceptions and RTTI by default, and the LLVM coding standards tell contributors not to use either; in-tree code reports recoverable failures through "llvm::Error" and "llvm::Expected<T>". If an out-of-tree tool does enable exceptions, catch them before they can propagate into LLVM code.
* **Don't Do This:** Write code that leaks resources or corrupts data structures if an exception is thrown.
**Why:** Exception safety prevents unexpected behavior and data corruption.
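Because LLVM is normally compiled with exceptions disabled, fallible operations usually return an error value instead of throwing. The following is a minimal sketch of that style using only the standard library; "ParseError", "ParseResult", and "parsePort" are hypothetical names chosen for illustration, and in-tree code would more likely use "llvm::Expected<T>" for the same purpose.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <string_view>
#include <system_error>
#include <variant>

// Hypothetical stand-in for an Expected-style result: either a value or an
// error message. These names are illustrative, not part of any LLVM API.
struct ParseError {
  std::string Message;
};
using ParseResult = std::variant<int, ParseError>;

// Parses a decimal TCP port without throwing; failures are returned.
ParseResult parsePort(std::string_view Text) {
  int Value = 0;
  const char *Begin = Text.data();
  const char *End = Begin + Text.size();
  auto [Ptr, Ec] = std::from_chars(Begin, End, Value);
  if (Ec != std::errc() || Ptr != End)
    return ParseError{"not a decimal integer: " + std::string(Text)};
  if (Value < 1 || Value > 65535)
    return ParseError{"port out of range: " + std::string(Text)};
  return Value;
}

// Callers must check the result before reading the value.
bool isOk(const ParseResult &R) { return std::holds_alternative<int>(R); }

int getValue(const ParseResult &R) {
  assert(isOk(R) && "result must be checked before access");
  return std::get<int>(R);
}
```

Returning the error as a value keeps every failure path visible at the call site, which is the property the RAII advice above relies on.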
### 3.7. Concurrency and Thread Safety
* **Do This:** When writing multi-threaded code, use LLVM's threading utilities (e.g., "llvm::thread", "llvm::sys::Mutex", and the thread pool in "llvm/Support/ThreadPool.h") or standard C++ threading primitives ("std::thread", "std::mutex", "std::atomic"). Ensure that your code is thread-safe by using proper locking and synchronization mechanisms. Consider using LLVM's parallel algorithms (e.g., "llvm::parallelForEach" from "llvm/Support/Parallel.h") where applicable. Note that "llvm::for_each" is only a range wrapper around "std::for_each" and does not run in parallel.
* **Don't Do This:** Introduce data races or deadlocks in multi-threaded code. Use global mutable variables without proper synchronization.
**Why:** Concurrency bugs can be difficult to detect and debug.
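The locking advice above can be sketched with standard C++ primitives alone. "HitCounter" and "countInParallel" are hypothetical names for illustration; the point is that the mutex lives next to the data it protects and is only ever taken through an RAII guard.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical statistics counter shared by several worker threads.
class HitCounter {
public:
  void add(std::uint64_t N) {
    std::lock_guard<std::mutex> Lock(M); // RAII: unlocks on every exit path.
    Total += N;
  }
  std::uint64_t total() const {
    std::lock_guard<std::mutex> Lock(M);
    return Total;
  }

private:
  mutable std::mutex M;
  std::uint64_t Total = 0;
};

// Runs NumThreads workers that each add 1 to the counter PerThread times.
std::uint64_t countInParallel(unsigned NumThreads, unsigned PerThread) {
  assert(NumThreads > 0 && "need at least one worker");
  HitCounter Counter;
  std::vector<std::thread> Workers;
  Workers.reserve(NumThreads);
  for (unsigned T = 0; T < NumThreads; ++T)
    Workers.emplace_back([&Counter, PerThread] {
      for (unsigned I = 0; I < PerThread; ++I)
        Counter.add(1);
    });
  for (std::thread &W : Workers)
    W.join(); // Join before Counter goes out of scope.
  return Counter.total();
}
```

For a plain counter like this one, "std::atomic<std::uint64_t>" would also work; the mutex form is shown because it extends to state that spans more than one variable.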
### 3.8. Performance Optimization
* **Do This:** Profile your code to identify performance bottlenecks. Use appropriate data structures and algorithms. Avoid unnecessary memory allocations and copies. Consider using LLVM's intrinsics for optimized operations. Use tools like "perf" or "VTune" to analyze performance.
* **Don't Do This:** Make premature optimizations without profiling. Ignore performance implications.
**Why:** Optimizing performance is crucial for LLVM's functionality as a compiler infrastructure.
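As one illustration of avoiding unnecessary allocations and copies, the sketch below splits a line into fields that are views into the caller's buffer, and sizes the result vector once. It uses only the standard library; "splitFields" is a hypothetical helper, and LLVM code would typically reach for "llvm::StringRef" and "llvm::SmallVector" to get the same effect.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Splits Line on Sep without copying characters: each field is a view into
// the caller's buffer, which must outlive the returned vector.
std::vector<std::string_view> splitFields(std::string_view Line, char Sep) {
  // Count separators first so the vector allocates exactly once.
  std::size_t Count = 1;
  for (char C : Line)
    if (C == Sep)
      ++Count;
  std::vector<std::string_view> Fields;
  Fields.reserve(Count);

  std::size_t Start = 0;
  while (true) {
    std::size_t Pos = Line.find(Sep, Start);
    if (Pos == std::string_view::npos) {
      Fields.push_back(Line.substr(Start));
      break;
    }
    Fields.push_back(Line.substr(Start, Pos - Start));
    Start = Pos + 1;
  }
  assert(Fields.size() == Count && "reserve() estimate should be exact");
  return Fields;
}
```

Whether the extra counting pass pays off depends on the input, which is exactly the kind of question a profiler should answer before the code is changed.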
### 3.9. Security Best Practices
* **Do This:** Be aware of common security vulnerabilities (e.g., buffer overflows, format string bugs, integer overflows). Use safe coding practices to prevent these vulnerabilities. Validate external inputs. Consider using static analysis tools to detect security flaws.
* **Don't Do This:** Ignore security implications or introduce vulnerabilities into the codebase.
**Why:** Security vulnerabilities can compromise the integrity and reliability of LLVM-based tools.
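A minimal sketch of validating external input before use, assuming a hypothetical record layout of "Count" elements of "ElemSize" bytes at "Offset" inside an untrusted buffer. The names "mulNoOverflow" and "readRecord" are illustrative only. Both the size computation and the range are checked before any byte is read, which covers the integer-overflow and buffer-overflow cases named above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Returns A * B, or std::nullopt if the product would overflow size_t.
std::optional<std::size_t> mulNoOverflow(std::size_t A, std::size_t B) {
  if (A != 0 && B > std::numeric_limits<std::size_t>::max() / A)
    return std::nullopt;
  return A * B;
}

// Hypothetical reader for an untrusted record. Every field is validated
// before the buffer is touched.
std::optional<std::vector<std::uint8_t>>
readRecord(const std::vector<std::uint8_t> &Buffer, std::size_t Offset,
           std::size_t Count, std::size_t ElemSize) {
  std::optional<std::size_t> Bytes = mulNoOverflow(Count, ElemSize);
  if (!Bytes)
    return std::nullopt; // Size computation overflowed.
  if (Offset > Buffer.size() || *Bytes > Buffer.size() - Offset)
    return std::nullopt; // Range falls outside the buffer.
  assert(Offset + *Bytes <= Buffer.size() && "range was validated above");
  return std::vector<std::uint8_t>(Buffer.begin() + Offset,
                                   Buffer.begin() + Offset + *Bytes);
}
```

The range check is written as a subtraction on the already validated side ("Buffer.size() - Offset") so that the comparison itself cannot overflow.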
### 3.10. Tooling Integration
* **Do This:** Integrate your tools with existing LLVM infrastructure, like FileCheck. Use standard LLVM libraries for common tasks. Contribute reusable components back to LLVM when appropriate.
* **Don't Do This:** Reinvent the wheel or create standalone tools that duplicate existing functionality.
**Why:** A unified ecosystem improves toolchain maintainability and reusability.
## 4. Recommended Libraries and Tools
* **clang-format:** Automatic code formatter for LLVM code style.
* **clang-tidy:** Static analysis tool for detecting code defects.
* **lit:** LLVM's integrated testing tool.
* **FileCheck:** Flexible pattern matching utility for testing.
* **CMake:** Cross-platform build system.
* **Ninja:** Fast build system.
* **valgrind:** Memory debugging and profiling tool.
* **gdb/lldb:** Debuggers for C++.
* **perf/VTune:** Performance analysis tools.
* **GitHub pull requests:** Code review workflow used by LLVM (replaced Phabricator in 2023).
* **TableGen:** Tool for generating code from declarative descriptions.
## 5. Deprecated Features and Known Issues
* Refer to the latest LLVM release notes for information on deprecated features and known issues. Avoid using deprecated APIs or features.
* Be aware of potential bugs in third-party libraries or tools. Report any issues you find to the appropriate developers.
## 6. Conclusion
Following these tooling and ecosystem standards will contribute to a higher quality, more maintainable, and more efficient LLVM project. By adhering to these guidelines, developers can ensure consistency, improve collaboration, and prevent common pitfalls. Because LLVM's APIs and tooling change between releases, revisit these practices alongside the release notes.
# API Integration Standards for LLVM
This document outlines the coding standards for integrating external APIs and backend services within the LLVM project. It focuses on patterns and practices that ensure maintainability, performance, and security, while adhering to the existing LLVM coding conventions. These guidelines aim to provide a consistent approach to API integration across the LLVM ecosystem.
## 1. Introduction
Integrating LLVM components with external APIs and backend services requires careful consideration to maintain the project's stability, performance, and security. This document provides guidelines for creating robust, understandable, and maintainable integrations. It covers best practices for error handling, data serialization, asynchronous operations, authentication, and more. While LLVM traditionally avoids extensive external dependencies, certain tools and analyses may benefit significantly from external integration. This document addresses these scenarios, keeping the core LLVM principles in mind.
## 2. General Principles
* **Minimize Dependencies:** Strive to minimize external dependencies. Evaluate the cost of introducing a new dependency against the benefits it provides. Consider if the functionality can be implemented within LLVM components.
    * **Do This:** Carefully evaluate whether an external dependency is truly necessary.
    * **Don't Do This:** Introduce dependencies without a thorough assessment of their impact on the project.
* **Maintainability & Readability:** Code should be self-documenting, easy to understand, and follow LLVM's overall coding style.
    * **Do This:** Use meaningful variable and function names. Add comments explaining complex logic or integration points. Follow the LLVM coding style consistently.
    * **Don't Do This:** Write overly complex or cryptic code. Skimp on comments, especially around API calls.
* **Error Handling:** Implement robust error handling to gracefully handle API failures. Log errors appropriately.
    * **Do This:** Use exception handling or checked error returns as appropriate. Provide informative error messages.
    * **Don't Do This:** Ignore error codes or exceptions. Allow exceptions to propagate uncaught.
* **Security:** Protect against common security vulnerabilities (e.g., injection attacks, data breaches) when interacting with external APIs.
    * **Do This:** Validate all inputs from external APIs. Use secure communication protocols (HTTPS). Follow security best practices for the external platform.
    * **Don't Do This:** Trust data received from external APIs without validation. Store sensitive information unencrypted.
## 3. Connecting with Backend Services
### 3.1 Architectural Considerations
* **Abstraction:** Introduce an abstraction layer to isolate the LLVM components from the specifics of the external API. This makes it easier to change the integration implementation or switch to a different API in the future.
    * **Why:** This reduces coupling and promotes a clear separation of concerns.
* **Design Patterns:** Employ proven design patterns like Facade, Adapter, or Repository to structure the integration code.
    * **Why:** These patterns improve code organization, testability, and reduce code duplication.
"""c++
// Example: Facade pattern for API integration
#include "llvm/Support/raw_ostream.h"
#include <string>
class ExternalAPI {
public:
  virtual std::string fetchData(const std::string& query) = 0;
  virtual ~ExternalAPI() = default;
};
class ConcreteExternalAPI : public ExternalAPI {
public:
  std::string fetchData(const std::string& query) override {
    // Implementation to call the actual external API (e.g., using libcurl)
    // Example (placeholder):
    std::string result = "Data from external API for query: " + query;
    return result;
  }
};
class LLVMDataService {
public:
  // Does not take ownership; the API object must outlive the service.
  LLVMDataService(ExternalAPI* api) : externalAPI(api) {}
  std::string retrieveData(const std::string& query) {
    // Perform LLVM-specific logic before calling the API
    llvm::outs() << "Preparing to fetch data for: " << query << "\n";
    std::string data = externalAPI->fetchData(query);
    // Perform LLVM-specific logic after calling the API
    llvm::outs() << "Data retrieved successfully.\n";
    return data;
  }
private:
  ExternalAPI* externalAPI;
};
// Usage:
// ConcreteExternalAPI apiImpl;
// LLVMDataService service(&apiImpl);
// std::string data = service.retrieveData("some_query");
"""
### 3.2 Implementation Details
* **HTTP Clients:** Use a well-established HTTP client library (e.g., "libcurl"). Ensure it is properly configured to handle TLS/SSL and other security concerns. Consider using a high-level library wrapper for easier use, but ensure it doesn't add significant overhead.
    * **Do This:** Choose a robust, widely used library with good support for security features.
    * **Don't Do This:** Roll your own HTTP client or use a deprecated library.
* **Data Serialization:** Use a standardized data serialization format (e.g., JSON, Protocol Buffers). Use a library specifically designed for the format.
    * **Do This:** Choose a format appropriate for your data and performance requirements. Use "rapidjson" or "llvm::json" for smaller lightweight payloads where speed is critical. Use protobuf where schemas are well defined and versioning is a concern.
    * **Don't Do This:** Use custom or ad-hoc serialization formats. Serialize sensitive data without encryption.
"""c++
// Example: Using llvm::json for serializing and deserializing data
#include "llvm/Support/JSON.h"
#include "llvm/Support/raw_ostream.h"
#include <optional>
#include <string>
#include <utility>
namespace llvm {
void serializeData() {
  json::Object obj;
  obj["name"] = "Example Data";
  obj["value"] = 42;
  obj["items"] = json::Array{1, 2, 3, 4, 5};
  // Stream the value into a string; json::Value supports operator<<.
  std::string jsonString;
  raw_string_ostream os(jsonString);
  os << json::Value(std::move(obj));
  os.flush();
  outs() << "Serialized JSON: " << jsonString << "\n";
}
void deserializeData(const std::string& jsonString) {
  Expected<json::Value> jsonValue = json::parse(jsonString);
  if (!jsonValue) {
    errs() << "Failed to parse JSON: " << toString(jsonValue.takeError()) << "\n";
    return;
  }
  const json::Object *obj = jsonValue->getAsObject();
  if (!obj) {
    errs() << "Expected JSON object.\n";
    return;
  }
  // Typed getters return std::nullopt or nullptr for a missing or mistyped key
  // (std::optional here assumes LLVM 16 or newer).
  std::optional<StringRef> name = obj->getString("name");
  std::optional<int64_t> value = obj->getInteger("value");
  const json::Array *items = obj->getArray("items");
  if (!name || !value || !items) {
    errs() << "Missing or mistyped field.\n";
    return;
  }
  outs() << "Name: " << *name << "\n";
  outs() << "Value: " << *value << "\n";
  outs() << "Items: ";
  for (const json::Value &item : *items)
    if (std::optional<int64_t> n = item.getAsInteger())
      outs() << *n << " ";
  outs() << "\n";
}
} // namespace llvm
// Usage:
// llvm::serializeData();
// llvm::deserializeData(R"({"name":"Example Data","value":42,"items":[1,2,3,4,5]})");
"""
* **Asynchronous Operations:** Use asynchronous operations to avoid blocking the main thread, especially for long-running API calls. Use "std::future" and "std::async".
    * **Do This:** Launch API calls in separate threads or use a non-blocking I/O model.
    * **Don't Do This:** Perform synchronous API calls on the main thread.
* **Authentication:** Implement a secure authentication mechanism (e.g., OAuth 2.0, API keys). Store credentials securely (e.g., using a secrets manager).
    * **Do This:** Use established authentication protocols. Regularly rotate API keys.
    * **Don't Do This:** Hardcode credentials in the source code. Store credentials in plain text. Store secrets directly in git.
## 4. LLVM-Specific Considerations
* **Integration Points:** Identify appropriate extension points within LLVM for integrating with external APIs. Common areas include:
    * **Passes:** Create a new pass to interact with the API.
    * **Analysis Utilities:** Extend analysis utilities to fetch data from external sources.
    * **Target-Specific CodeGen:** Modify target-specific code generation to leverage external services.
* **LLVM Context:** Ensure the API calls do not interfere with the LLVM context or the overall compilation process.
    * **Do This:** Create a separate LLVM context for API-related operations (if necessary). Carefully synchronize access to shared resources.
    * **Don't Do This:** Directly modify the LLVM context from API callbacks without proper synchronization.
* **Error Reporting:** Use LLVM's error reporting mechanisms to provide informative error messages related to API failures.
    * **Do This:** Use "llvm::errs()" and other LLVM error reporting tools to propagate errors to the user.
    * **Don't Do This:** Use "std::cerr" or other generic error streams.
"""c++
// Example: Reporting errors using llvm::errs()
#include "llvm/Support/raw_ostream.h"
#include <string>
void handleAPIError(const std::string& errorMessage) {
  llvm::errs() << "Error during API call: " << errorMessage << "\n";
}
// Usage:
// if (apiCallFailed) {
//   handleAPIError("Failed to retrieve data from the external service.");
// }
"""
## 5. Modern Approaches and Patterns
* **gRPC:** Consider using gRPC for communication with backend services. gRPC is a high-performance, open-source universal RPC framework. Its advantages include:
    * **Protocol Buffers:** Uses Protocol Buffers for efficient serialization.
    * **Code Generation:** Automatically generates client and server code from protocol definitions.
    * **Multiple Languages:** Supports multiple programming languages.
* **Microservices:** Design integrations following a microservices architecture. Decompose the API integration into smaller, independent services.
    * **Benefits:** Improved scalability, maintainability, and fault isolation.
    * **Considerations:** Increased complexity and managing inter-service communication.
* **Serverless Functions:** Use serverless functions (e.g., AWS Lambda, Azure Functions) to implement API integrations.
    * **Benefits:** Scalability, cost-effectiveness, and reduced operational overhead.
## 6. Common Anti-Patterns and Mistakes
* **Tight Coupling:** Tightly coupling LLVM components with external APIs makes the code fragile and difficult to test/maintain.
* **Ignoring Rate Limits:** Exceeding API rate limits can lead to service disruptions or being blocked.
* **Lack of Monitoring:** Failing to monitor the health and performance of API integrations results in delayed problem detection and resolution.
## 7. Performance Optimization
* **Caching:** Implement caching mechanisms to reduce the number of API calls, especially for frequently requested data. Use "llvm::StringMap" or other LLVM data structures for efficient storage.
    * **Do This:** Implement a cache with a reasonable expiration policy (TTL). Use a cache key that accurately reflects the data being cached.
    * **Don't Do This:** Cache sensitive data without proper encryption. Store an unbounded cache that grows indefinitely and consumes resources.
* **Batching:** Batch multiple API requests into a single call to reduce network overhead.
    * **Why:** Reduces the overhead of multiple requests.
* **Compression:** Enable data compression to reduce the size of data transmitted over the network.
    * **Why:** Reduces bandwidth usage and improves transfer speed.
## 8. Security Best Practices
* **Input Validation:** Validate all inputs from external APIs to prevent injection attacks and other vulnerabilities.
    * **Do This:** Use whitelisting to allow only valid characters, formats, and lengths.
      Understand the specific validation requirements of the LLVM code consuming the external data.
    * **Don't Do This:** Trust data received from external APIs without validation.
* **Data Encryption:** Encrypt sensitive data both in transit and at rest. Use TLS/SSL for communication over the network.
* **Access Control:** Implement proper access control to restrict access to sensitive data and API endpoints.
* **Regular Security Audits:** Conduct regular security audits to identify and address potential vulnerabilities.
## 9. Example: Integrating with a Simple REST API
This example demonstrates a simplified integration with a REST API using "libcurl". It uses the Facade pattern to abstract the API interaction.
"""c++
// Note: this example throws and catches exceptions, so the tool must be built
// with exceptions enabled (LLVM's own build disables them by default).
#include "llvm/Support/raw_ostream.h"
#include <curl/curl.h>
#include <string>
#include <stdexcept>
namespace llvm {
class RESTAPI {
public:
  virtual std::string fetchData(const std::string& url) = 0;
  virtual ~RESTAPI() = default;
};
class CurlRESTAPI : public RESTAPI {
public:
  std::string fetchData(const std::string& url) override {
    std::string response;
    CURL* curl = curl_easy_init();
    if (!curl) {
      throw std::runtime_error("Failed to initialize libcurl");
    }
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeCallback);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
    curl_easy_setopt(curl, CURLOPT_USERAGENT, "LLVM REST Client"); // Set a user agent
    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK) {
      curl_easy_cleanup(curl);
      throw std::runtime_error("curl_easy_perform() failed: " + std::string(curl_easy_strerror(res)));
    }
    curl_easy_cleanup(curl);
    return response;
  }
private:
  // libcurl passes the CURLOPT_WRITEDATA pointer back as void*.
  static size_t writeCallback(char* contents, size_t size, size_t nmemb, void* userData) {
    size_t totalSize = size * nmemb;
    static_cast<std::string*>(userData)->append(contents, totalSize);
    return totalSize;
  }
};
class MyLLVMTool {
public:
  // Does not take ownership; the API object must outlive the tool.
  MyLLVMTool(RESTAPI* api) : restAPI(api) {}
  std::string fetchDataFromAPI(const std::string& query) {
    try {
      // The query is appended as-is; URL-encode it if it can contain reserved characters.
      std::string apiUrl = "https://api.example.com/data?q=" + query;
      std::string data = restAPI->fetchData(apiUrl);
      return data;
    } catch (const std::exception& e) {
      llvm::errs() << "Error fetching data: " << e.what() << "\n";
      return ""; // Or handle the error more gracefully as needed
    }
  }
private:
  RESTAPI* restAPI;
};
} // namespace llvm
// Usage:
// llvm::CurlRESTAPI apiImpl;
// llvm::MyLLVMTool tool{&apiImpl};
// std::string data = tool.fetchDataFromAPI("some_query");
"""
**Important Considerations:**
* This example is simplified. Adapt it to your specific API requirements.
* Replace "https://api.example.com/data" with the actual URL of the REST API.
* Implement proper error handling (e.g., checking HTTP status codes).
* Parse the API response with a JSON library (e.g., "rapidjson", "llvm::json") rather than ad-hoc string handling.
* Use dependency injection (as shown) to allow for easier testing with mock APIs.
This document provides a comprehensive set of guidelines for integrating external APIs and backend services within LLVM. Adhering to these standards will help create robust, maintainable, and secure integrations that enhance the capabilities of the LLVM project.
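Two points from the API integration guidance, running calls through "std::async" and injecting a mock API for tests, can be sketched together. This is a standalone, standard-library-only illustration: "RESTAPI" here is redeclared in the same shape as the interface in section 9, while "MockRESTAPI", "fetchAsync", and "fetchAndWait" are hypothetical names.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

// Same shape as the RESTAPI interface in section 9, reduced to standard C++.
class RESTAPI {
public:
  virtual std::string fetchData(const std::string &Url) = 0;
  virtual ~RESTAPI() = default;
};

// Hypothetical test double: returns a canned body and records the last URL.
class MockRESTAPI : public RESTAPI {
public:
  std::string fetchData(const std::string &Url) override {
    LastUrl = Url;
    return "{\"status\":\"ok\"}";
  }
  std::string LastUrl;
};

// Starts the request on another thread. Api must outlive the returned future.
std::future<std::string> fetchAsync(RESTAPI &Api, std::string Url) {
  return std::async(std::launch::async, [&Api, Url = std::move(Url)] {
    return Api.fetchData(Url);
  });
}

// Launches the request, leaves room for other work, then waits for the body.
std::string fetchAndWait(RESTAPI &Api, const std::string &Url) {
  std::future<std::string> Pending = fetchAsync(Api, Url);
  assert(Pending.valid() && "std::async returns a valid future");
  // ... other work could run here while the request is in flight ...
  return Pending.get();
}
```

Because the caller depends only on the abstract interface, the same "fetchAsync" works unchanged with a libcurl-backed implementation in production and with the mock in unit tests.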
# Component Design Standards for LLVM This document outlines the component design standards for LLVM, focusing on creating reusable, maintainable, and performant components within the LLVM ecosystem. It provides guidelines applicable to all LLVM subprojects, including the core compiler infrastructure, Clang, LLD, and related tools. These standards are designed to promote consistency, readability, and long-term maintainability of the LLVM codebase. ## 1. Principles of Component Design in LLVM ### 1.1 Abstraction and Encapsulation * **Do This:** Design components with clear abstractions that hide implementation details. * **Don't Do This:** Expose internal data structures or implementation details directly. **Why:** Abstraction simplifies the interface and reduces interdependence between components, making it easier to modify one component without affecting others. Encapsulation protects the internal state of a component from unintended external modification. **Code Example (Good):** """c++ // Good: Hiding implementation details behind an abstract interface. class TargetLoweringInfo { public: virtual ~TargetLoweringInfo() = default; virtual unsigned getRegForInlineAsm(const TargetRegisterClass *RC, MVT VT) const = 0; virtual CallingConv::ID getIRCallConv() const = 0; // ... other abstract methods }; // Concrete implementation, not exposed directly class AArch64TargetLowering : public TargetLoweringInfo { public: unsigned getRegForInlineAsm(const TargetRegisterClass *RC, MVT VT) const override { // ... implementation details for AArch64 return 0; // Dummy return } CallingConv::ID getIRCallConv() const override { //... return CallingConv::C; // dummy default } }; """ **Code Example (Bad):** """c++ // Bad: Exposing internal data structures. struct MyComponent { std::vector<int> internalData; // Directly accessible and modifiable. void processData(); // Public method. 
}; """ This example violates encapsulation by allowing direct access to the "internalData", which can lead to unintended side effects. ### 1.2 Single Responsibility Principle (SRP) * **Do This:** A component should have only one reason to change. * **Don't Do This:** Components should not be overly complex or perform unrelated tasks. Overly complex components are harder to understand, test, and maintain. **Why:** Following SRP makes components more focused, which improves their readability, testability, and reusability. **Code Example (Good):** """c++ // Good: Separate class for each responsibility. class InstructionSelector { public: MachineInstr *select(const BasicBlock &BB) { // ... instruction selection logic return nullptr; // Dummy return } }; class InstructionScheduler { public: void schedule(MachineInstr *instr) { // ... instruction scheduling logic } }; """ These classes are focused on their respective tasks: instruction selection and instruction scheduling. **Code Example (Bad):** """c++ // Bad: Combined responsibilities. class CompilerPass { public: void run(Module &M) { // ... instruction selection // ... instruction scheduling // ... register allocation } }; """ The "CompilerPass" class mixes instruction selection, scheduling, and register allocation, violating SRP. ### 1.3 Interface Segregation Principle (ISP) * **Do This:** Design interfaces that are specific to the clients that use them. * **Don't Do This:** Force clients to depend on interfaces they don't use. **Why:** ISP prevents unnecessary dependencies and reduces the impact of changes to interfaces. **Code Example (Good):** """c++ // Good: Segregated interfaces. class Printable { public: virtual void print() = 0; }; class Serializable { public: virtual void serialize() = 0; }; class MyClass : public Printable, public Serializable { public: void print() override { /*...*/ } void serialize() override { /*...*/ } }; """ Each interface focuses on a specific functionality. 
Components only implement the interfaces they need. **Code Example (Bad):** """c++ // Bad: Monolithic interface. class SomeInterface { public: virtual void methodA() = 0; virtual void methodB() = 0; virtual void methodC() = 0; }; class ClientA : public SomeInterface { public: void methodA() override { /*...*/ } void methodB() override { /*...*/ } void methodC() override { /* throw std::runtime_error("Not Implemented")*/ } // ClientA doesn't need this method, but is forced to implement it. }; """ This example forces "ClientA" to implement "methodC" even if it doesn't need it. ### 1.4 Dependency Inversion Principle (DIP) * **Do This:** Depend on abstractions, not concretions. * **Don't Do This:** Hardcode dependencies on concrete classes. **Why:** DIP promotes loose coupling and makes it easier to substitute implementations. **Code Example (Good):** """c++ // Good: Using dependency injection. class Logger { public: virtual void log(const std::string &message) = 0; }; class ConsoleLogger : public Logger { public: void log(const std::string &message) override { std::cout << message << std::endl; } }; class MyComponent { private: Logger *logger; public: MyComponent(Logger *logger) : logger(logger) {} void doSomething() { logger->log("Doing something..."); } }; """ The "MyComponent" depends on the "Logger" abstraction, not a concrete "ConsoleLogger". This allows substituting different loggers easily. **Code Example (Bad):** """c++ // Bad: Hardcoded dependency. class MyBadComponent { private: ConsoleLogger logger; // Hardcoded dependency on ConsoleLogger public: void doSomething() { logger.log("Doing something..."); } }; """ This example is tightly coupled to "ConsoleLogger", making it harder to test and reuse. ## 2. Component Structure and Organization ### 2.1 Directory Structure * **Do This:** Organize components into logical directories. * **Don't Do This:** Put all files in a single directory or create deeply nested directory structures that are hard to navigate. 
**Why:** A clear directory structure improves code discoverability and maintainability. LLVM generally follows a module-based structure.
**Example:**
"""
llvm/
  lib/
    IR/             # LLVM Intermediate Representation
    AsmParser/
    Core/
    Instructions/
      # Instruction definitions files (*.def)
      # Auto-generated instruction implementations (*.inc)
    Analysis/       # Static analyses
    Transforms/     # Transformations
      Scalar/
      Vectorize/
  include/
    llvm/
      IR/
      Analysis/
      Transforms/
  tools/
    llvm-as/        # LLVM assembler
    llvm-dis/       # LLVM disassembler
"""
### 2.2 Naming Conventions
* **Do This:** Use consistent naming conventions for files, classes, functions, and variables.
* **Don't Do This:** Use cryptic or inconsistent names that are hard to understand.
**Why:** Naming conventions improve code readability and reduce ambiguity. Names should be descriptive and follow LLVM's established conventions.
**Examples:**
* **Classes:** "ClassName" (CamelCase, starting with a capital letter)
* **Functions:** "functionName" (camelCase, starting with a lowercase letter)
* **Variables:** "VariableName" (CamelCase, starting with a capital letter, as the LLVM Coding Standards specify)
* **Files:** "ComponentName.cpp", "ComponentName.h"
### 2.3 Header Files
* **Do This:** Use include guards to prevent multiple inclusions. Organize header files to minimize dependencies. Forward declare classes whenever possible.
* **Don't Do This:** Include unnecessary header files.
**Why:** Include guards prevent compilation errors. Minimizing includes reduces build times and dependencies.
**Code Example:**
"""c++
// MyComponent.h
#ifndef LLVM_MYCOMPONENT_H
#define LLVM_MYCOMPONENT_H
#include "llvm/Support/raw_ostream.h" // Only required headers
namespace llvm {
class MyClass; // Forward declaration
class MyComponent {
public:
  void doSomething(MyClass &Obj);
};
} // namespace llvm
#endif // LLVM_MYCOMPONENT_H
"""
## 3. Code Style and Formatting
### 3.1 LLVM Style
* **Do This:** Follow the LLVM coding standards (see [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html)). Use "clang-format" to automatically format your code.
* **Don't Do This:** Deviate from the LLVM coding standards.
* **Why:** Consistency improves readability and reduces cognitive load.
* Use 2-space indentation.
* Keep lines within 80 columns.
* Use descriptive comments.
### 3.2 Using "clang-format"
* **Do This:** Configure "clang-format" with the LLVM style. Run it before committing code.
* **Don't Do This:** Rely on manual formatting or skip "clang-format".
**Why:** "clang-format" automates code formatting and ensures compliance with LLVM's style guidelines.
**Configuration:** LLVM uses a ".clang-format" file in the root of the project to define the style.
**Usage:**
"""bash
clang-format -i MyComponent.cpp # Format the file in-place
"""
### 3.3 Comments and Documentation
* **Do This:** Write clear and concise comments that explain the purpose and behavior of your code. Use Doxygen to generate API documentation. Provide usage examples.
* **Don't Do This:** Write redundant comments that simply repeat what the code does.
**Why:** Good comments improve code understanding and maintainability. Doxygen documentation makes it easy to generate API references automatically.
**Code Example:**
"""c++
#include <vector>
/// Calculates the average value of a vector of integers.
///
/// \param Values The vector of integers.
/// \returns The average value, or 0.0 if the vector is empty.
double calculateAverage(const std::vector<int> &Values) {
  if (Values.empty()) {
    return 0.0;
  }
  double Sum = 0.0;
  for (int Value : Values) {
    Sum += Value;
  }
  return Sum / Values.size();
}
"""
## 4. Error Handling and Assertions
### 4.1 LLVM Error Handling
* **Do This:** Use "llvm::Error" to represent and propagate errors. Employ "Expected<T>" to represent a value that might be an error.
* **Don't Do This:** Use C++ exceptions for error handling. LLVM does not use exceptions or RTTI.
* **Why:** "llvm::Error" provides a consistent and efficient way to handle errors within LLVM.
**Code Example:**
"""c++
#include "llvm/Support/Error.h"
#include "llvm/Support/raw_ostream.h"
#include <utility>
llvm::Error doSomething(int value) {
  if (value < 0) {
    return llvm::make_error<llvm::StringError>(
        "Value must be non-negative", llvm::inconvertibleErrorCode());
  }
  // ... do something
  return llvm::Error::success();
}
llvm::Expected<int> computeValue(int input) {
  if (input == 0) {
    return llvm::make_error<llvm::StringError>(
        "Input cannot be zero", llvm::inconvertibleErrorCode());
  }
  return input * 2;
}
int main() {
  if (auto Err = doSomething(-1)) {
    // toString() consumes the error. An llvm::Error that is never handled
    // aborts the program in assertion-enabled builds.
    llvm::errs() << "Error: " << llvm::toString(std::move(Err)) << "\n";
    return 1;
  }
  auto Result = computeValue(10);
  if (!Result) {
    llvm::errs() << "Error: " << llvm::toString(Result.takeError()) << "\n";
    return 1;
  }
  llvm::outs() << "Value: " << *Result << "\n";
  return 0;
}
"""
### 4.2 Assertions
* **Do This:** Use the "llvm_unreachable" macro (from "llvm/Support/ErrorHandling.h") for cases that should never occur. Use "assert" liberally to check preconditions and invariants during development.
* **Don't Do This:** Use assertions for handling expected errors or user input validation.
**Why:** Assertions help catch bugs early in development. "llvm_unreachable" marks a code path that the author believes can never execute.
**Code Example:**
"""c++
#include "llvm/Support/ErrorHandling.h"
#include <cassert>
int getValue(int index) {
  assert(index >= 0 && index < 10 && "Index out of bounds"); // Precondition.
  if (index == 5) {
    llvm_unreachable("This should never happen!");
  }
  return index * 2;
}
"""
## 5. Performance Considerations
### 5.1 Data Structures Selection
* **Do This:** Choose appropriate data structures based on performance requirements. Consider using "llvm::SmallVector", "llvm::DenseMap", "llvm::SetVector", and other LLVM-specific data structures.
* **Don't Do This:** Use standard library containers without considering performance implications.
**Why:** LLVM-specific data structures are often optimized for common LLVM use cases. "SmallVector" avoids dynamic allocation for small numbers of elements. "DenseMap" is optimized for small keys such as integers and pointers.
**Code Example:**
"""c++
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"
// Taking SmallVectorImpl<int> & lets callers pass a SmallVector with any
// inline size.
void processValues(llvm::SmallVectorImpl<int> &values) {
  // ... process values efficiently
}
void processMap(llvm::DenseMap<int, int> &map) {
  // ... process map efficiently
}
"""
### 5.2 Memory Management
* **Do This:** Use RAII (Resource Acquisition Is Initialization) to manage resources. Avoid manual memory management. When dynamic allocation is unavoidable, use smart pointers ("std::unique_ptr", "std::shared_ptr").
* **Don't Do This:** Use "new" and "delete" directly. Leak memory.
**Why:** RAII and smart pointers automate resource management and prevent memory leaks.
**Code Example:**
"""c++
#include <memory>
class MyResource {
public:
  MyResource() { /* Acquire resource */ }
  ~MyResource() { /* Release resource */ }
};
void doSomething() {
  auto resource = std::make_unique<MyResource>();
  // The resource is released automatically when "resource" goes out of scope.
}
"""
### 5.3 Code Optimization
* **Do This:** Profile your code. Use optimization flags ("-O2", "-O3"). Consider using profile-guided optimization (PGO). Minimize unnecessary computations.
* **Don't Do This:** Optimize prematurely without profiling. Ignore performance bottlenecks.
**Why:** Profiling identifies performance bottlenecks. Optimization flags and PGO improve code performance.
## 6. Testing
### 6.1 Unit Tests
* **Do This:** Write unit tests for each component. Use LLVM's testing infrastructure: GoogleTest-based unit tests and lit-driven regression tests.
* **Don't Do This:** Neglect testing. Commit code without tests.
**Why:** Unit tests verify the correctness of components and prevent regressions.
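As a rough illustration of the unit-testing guidance above, the following framework-free sketch tests a small function with plain "assert". In the LLVM tree the same checks would normally be written as GoogleTest cases; the function mirrors the "calculateAverage" example from section 3.3, and "nearlyEqual" and the test function names are hypothetical helpers introduced only for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Function under test, modeled on the calculateAverage example in 3.3.
double calculateAverage(const std::vector<int> &values) {
  if (values.empty())
    return 0.0;
  double sum = 0.0;
  for (int value : values)
    sum += value;
  return sum / static_cast<double>(values.size());
}

// Hypothetical helper: compares doubles with a small absolute tolerance.
bool nearlyEqual(double a, double b) { return std::fabs(a - b) < 1e-9; }

// Each test covers exactly one behavior so a failure points at one cause.
void testEmptyInputYieldsZero() {
  assert(nearlyEqual(calculateAverage({}), 0.0));
}

void testSingleElementIsItsOwnAverage() {
  assert(nearlyEqual(calculateAverage({4}), 4.0));
}

void testNonIntegralAverage() {
  assert(nearlyEqual(calculateAverage({1, 2, 4}), 7.0 / 3.0));
}

void testNegativeValuesCancelOut() {
  assert(nearlyEqual(calculateAverage({-2, 2}), 0.0));
}

// Runs every test; an assertion failure aborts on the first broken check.
void runAllAverageTests() {
  testEmptyInputYieldsZero();
  testSingleElementIsItsOwnAverage();
  testNonIntegralAverage();
  testNegativeValuesCancelOut();
}
```

Keeping one behavior per test (empty input, single element, non-integral result, negative values) is the design choice that carries over directly to a GoogleTest or lit-based suite.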
### 6.2 Integration Tests
* **Do This:** Write integration tests to ensure that components work together correctly.
* **Don't Do This:** Assume that components will work together without testing.
**Why:** Integration tests verify the interactions between components and catch integration issues.
### 6.3 Regression Tests
* **Do This:** Add regression tests for each bug fix to prevent regressions.
**Why:** Regression tests ensure that bug fixes are not inadvertently undone by future changes.
## 7. Concurrency and Thread Safety
### 7.1 Thread Safety
* **Do This:** Design components to be thread-safe if they will be used in a multi-threaded environment. Use appropriate locking mechanisms (for example "llvm::sys::Mutex" together with "std::lock_guard").
* **Don't Do This:** Share mutable state between threads without proper synchronization. Introduce race conditions.
**Why:** Thread safety prevents data corruption and undefined behavior in concurrent environments.
**Code Example:**
"""c++
#include "llvm/Support/Mutex.h"
#include <mutex>
class ThreadSafeComponent {
private:
  llvm::sys::Mutex mutex;
  int state = 0;
public:
  int getState() {
    std::lock_guard<llvm::sys::Mutex> lock(mutex);
    return state;
  }
  void setState(int newState) {
    std::lock_guard<llvm::sys::Mutex> lock(mutex);
    state = newState;
  }
};
"""
Following these component design standards will result in a more robust, maintainable, and performant LLVM codebase. Adherence to these guidelines is crucial for fostering a healthy and collaborative development environment.
# Code Style and Conventions Standards for LLVM This document outlines the code style and conventions standards for the LLVM project. Adhering to these guidelines ensures code consistency, readability, and maintainability, which are crucial for a large and complex project like LLVM. These guidelines are intended to be used by both human developers and AI coding assistants to improve the quality and consistency of LLVM code. ## 1. General Formatting ### 1.1. Indentation and Whitespace * **Do This:** Use 2 spaces for indentation. Tabs should *never* be used. * **Don't Do This:** Use tabs or more than 2 spaces for indentation. * **Why:** Consistency in indentation is vital for readability. Two spaces provide a good balance between code nesting and horizontal space consumption. * **Example:** """c++ if (condition) { for (int i = 0; i < 10; ++i) { // Code within the loop doSomething(i); } } else { // Alternative code } """ ### 1.2. Line Length * **Do This:** Keep lines under 80 characters where practical. Aim for readability, and don't obsess over fitting everything into 80 characters if it harms clarity. * **Don't Do This:** Allow lines to routinely exceed 120 characters, making them hard to read on smaller screens or in diffs. * **Why:** Shorter lines improve readability and facilitate code review by allowing side-by-side comparisons in diff tools. * **Example:** """c++ // Good: Line split for readability Value *result = builder->CreateAdd(operand1, operand2, "sum"); // Bad: Long line, harder to read Value *result = builder->CreateAdd(operand1, operand2, "very_long_variable_name_that_makes_the_line_exceed_80_characters"); """ ### 1.3. Whitespace Usage * **Do This:** * Use a single space after keywords like "if", "for", "while", and "switch". * Use a single space around operators like "=", "+", "-", "*", "/", "==", "!=", "<", ">", "<=", ">=", "&&", "||". * Do not use spaces inside parentheses, brackets, or braces except where needed for clarity. 
* **Don't Do This:**
  * Omit spaces after keywords or around operators.
  * Add excessive spaces inside parentheses, brackets, or braces.
* **Why:** Consistent whitespace improves readability and makes the code visually less cluttered.
* **Example:**
"""c++
// Good
if (x == 5) {
  y = z + 1;
}
// Bad
if(x==5){
y=z+1;
}
"""
### 1.4. Vertical Whitespace
* **Do This:** Use blank lines to separate logical blocks of code, such as function definitions, major sections within a function, and between different data structures.
* **Don't Do This:** Overuse or underuse blank lines, resulting in either scattered or crammed code.
* **Why:** Judicious use of vertical whitespace enhances the visual structure of the code, making it easier to understand.
* **Example:**
"""c++
// Good
void processData() {
  // Initialize variables
  int count = 0;
  std::vector<int> data;

  // Load data from file
  loadDataFromFile("data.txt", data);

  // Process data
  for (int value : data) {
    count += value;
  }

  // Print result
  std::cout << "Total: " << count << std::endl;
}
// Bad (crammed)
void processData(){int count=0;std::vector<int> data;loadDataFromFile("data.txt",data);for(int value:data){count+=value;}std::cout<<"Total: "<<count<<std::endl;}
"""
## 2. Naming Conventions
### 2.1. General Naming
* **Do This:**
  * Use descriptive and meaningful names.
  * Prefer clear and explicit names over short and cryptic ones.
  * Be consistent in applying the same naming scheme across the project.
* **Don't Do This:**
  * Use single-character variable names (except in very short loops).
  * Use abbreviations that are not widely understood in the LLVM community.
* **Why:** Good naming significantly enhances code readability and reduces cognitive load.
* **Example:** "for (int i = 0; i < N; ++i)" is often fine, but "for (int ElementIndex = 0; ElementIndex < NumberOfElements; ++ElementIndex)" is easier to follow if the loop is more complex.
### 2.2. Variable Naming
* **Do This:** Use "PascalCase" nouns for variable names (camel case starting with an upper-case letter, e.g. "Leader" or "Boats"), as the LLVM Coding Standards specify.
* **Don't Do This:** Use "snake_case" for variables, or a lower-case initial letter, which the LLVM Coding Standards reserve for function names.
* **Why:** Variables represent state, so they are named as nouns. Some subprojects have their own variable-naming rules, so match the component you are editing (see 5.1).
* **Example:**
"""c++
int NumberOfItems = 10;
std::string ItemName = "Example";
"""
### 2.3. Function Naming
* **Do This:** Use "camelCase" for function names. Function names should generally be verbs or verb phrases indicating the action they perform.
* **Don't Do This:** Use "snake_case" or "PascalCase" for function names.
* **Why:** Consistency in function naming is important. Verb-based names accurately describe what functions do.
* **Example:**
"""c++
int calculateSum(int a, int b) {
  return a + b;
}
void processData() {
  // ...
}
"""
### 2.4. Class and Struct Naming
* **Do This:** Use "PascalCase" for class and struct names.
* **Don't Do This:** Use "camelCase" or "snake_case" for class and struct names.
* **Why:** "PascalCase" is the standard convention for class and struct names in LLVM.
* **Example:**
"""c++
class MyClass {
public:
  // ...
};
struct DataStructure {
  int value;
};
"""
### 2.5. Constant Naming
* **Do This:** Use "PascalCase" for named constants (i.e. those defined with "static const"). Use all-uppercase "SCREAMING_SNAKE_CASE" for "#define" constants. Prefer "static const" over "#define" whenever possible.
* **Don't Do This:** Use "camelCase" or "snake_case" for constants.
* **Why:** Named constants follow the same convention as other variables, while the all-uppercase form makes preprocessor macros stand out.
* **Example:**
"""c++
static const int MaxValue = 100;
#define ARRAY_SIZE 256
"""
### 2.6. Enum Naming
* **Do This:** Use "PascalCase" for enum names and "PascalCase" for enum values.
* **Don't Do This:** Use "camelCase" or "snake_case" for enum names and values.
* **Why:** Consistent enum naming enhances code clarity.
* **Example:**
"""c++
enum class Color {
  Red,
  Green,
  Blue
};
"""
### 2.7. Template Parameter Naming
* **Do This:** Use a single uppercase letter, or a descriptive name starting with an uppercase letter.
When a descriptive name is used, it should match the concept that the template parameter represents. * **Don't Do This:** Use unclear abbreviations. * **Why:** Template parameters should be easily identifiable within the template. * **Example:** """c++ template <typename T> T add(T a, T b) { return a + b; } template <typename ElementType> class MyVector { // ... }; """ ## 3. Comments ### 3.1. General Commenting * **Do This:** * Write clear and concise comments to explain complex logic and design decisions. * Keep comments up-to-date with code changes. * Use proper grammar and spelling in comments. * **Don't Do This:** * Write obvious comments that simply restate the code. * Leave outdated or incorrect comments. * Use excessive jargon or abbreviations without explanation. * **Why:** Comments are essential for understanding the code's purpose and functionality. Clear and accurate comments reduce maintenance effort and improve collaboration. ### 3.2. Doxygen-Style Comments * **Do This:** Use Doxygen-style comments for documenting functions, classes, and files. * **Don't Do This:** Neglect to document the purpose, parameters, and return values of functions and classes. * **Why:** Doxygen-style comments allow automatic documentation generation, which is crucial for large projects like LLVM. * **Example:** """c++ /** * @brief Calculates the sum of two integers. * * This function adds two integers and returns the result. * * @param a The first integer. * @param b The second integer. * @return The sum of a and b. */ int calculateSum(int a, int b) { return a + b; } """ ### 3.3. Inline Comments * **Do This:** Use inline comments to explain specific lines or blocks of code that are not immediately obvious. * **Don't Do This:** Overuse inline comments for trivial code. * **Why:** Inline comments clarify complex logic or non-obvious operations. * **Example:** """c++ for (int i = 0; i < 10; ++i) { // Multiply i by 2 to get the even number int evenNumber = i * 2; // ... 
}
"""
## 4. Code Structure and Design
### 4.1 Function Length
* **Do This:** Keep functions reasonably short (typically under 50 lines). If a function becomes too long, refactor it into smaller, more manageable functions.
* **Don't Do This:** Write very long functions that perform multiple unrelated tasks.
* **Why:** Shorter functions are easier to understand, test, and maintain. They also promote code reuse.
### 4.2. Class Design
* **Do This:**
  * Adhere to the Single Responsibility Principle (SRP): each class should have one specific responsibility.
  * Use proper encapsulation: keep internal state private and provide access through public methods.
  * Use inheritance and polymorphism appropriately to model relationships between classes.
* **Don't Do This:**
  * Create "god classes" that do everything.
  * Expose internal state directly.
  * Overuse inheritance, leading to complex and fragile class hierarchies.
* **Why:** Good class design promotes modularity, reusability, and maintainability.
### 4.3. Error Handling
* **Do This:**
  * Use "llvm::Error" (and "llvm::Expected<T>" for fallible return values) to handle recoverable errors such as invalid input.
  * Handle errors gracefully and provide informative error messages.
  * Report failures through return values: LLVM does not use C++ exceptions.
* **Don't Do This:**
  * Ignore errors or handle them silently.
  * Throw or catch C++ exceptions in LLVM code.
* **Why:** Robust error handling is crucial for preventing crashes and providing a good user experience.
* **Example:** """c++ #include "llvm/Support/Error.h" #include "llvm/Support/raw_ostream.h" llvm::Error processData(int value) { if (value < 0) { return llvm::make_error<llvm::StringError>("Invalid value: " + std::to_string(value), llvm::inconvertibleErrorCode()); } // Process data llvm::outs() << "Processing value: " << value << "\n"; return llvm::Error::success(); } int main() { llvm::Error err1 = processData(10); if (err1) { llvm::errs() << "Error: " << llvm::toString(std::move(err1)) << "\n"; } llvm::Error err2 = processData(-5); if (err2) { llvm::errs() << "Error: " << llvm::toString(std::move(err2)) << "\n"; } return 0; } """ ### 4.4. Resource Management * **Do This:** * Use RAII (Resource Acquisition Is Initialization) to manage resources like memory, file handles, and locks. * Use smart pointers ("std::unique_ptr", "std::shared_ptr") to automatically manage dynamically allocated memory. * **Don't Do This:** * Manually allocate and deallocate memory using "new" and "delete" without proper RAII. * Leak resources. * **Why:** RAII ensures that resources are properly released, even in the presence of exceptions, preventing resource leaks and improving code reliability. ### 4.5. Modern C++ Features * **Do This:** * Use modern C++ features like lambda expressions, range-based for loops, and auto type deduction. * Prefer "constexpr" for compile-time constants. * Use move semantics to avoid unnecessary copying of objects. * **Don't Do This:** * Rely on deprecated C++ features. * Write verbose code that can be simplified with modern features. * **Why:** Modern C++ features can significantly improve code readability, efficiency, and safety. 
* **Example:** """c++ #include <iostream> #include <vector> int main() { std::vector<int> data = {1, 2, 3, 4, 5}; // Range-based for loop for (int value : data) { std::cout << value << " "; } std::cout << std::endl; // Lambda expression auto multiplyByTwo = [](int x) { return x * 2; }; std::vector<int> multipliedData; for (int value : data) { multipliedData.push_back(multiplyByTwo(value)); } for (int value : multipliedData) { std::cout << value << " "; } std::cout << std::endl; // Auto type deduction auto sum = 0; for (auto value : data) { sum += value; } std::cout << "Sum: " << sum << std::endl; return 0; } """ ## 5. LLVM-Specific Guidelines ### 5.1. LLVM Coding Conventions * **Do This:** * Follow the existing coding style and conventions of the specific LLVM component you are working on. * Look at existing files or recent commits within the specific directory to determine local conventions. * **Don't Do This:** * Introduce new coding styles that are inconsistent with the rest of the component. * **Why:** Consistency within a component is crucial for maintainability and readability. ### 5.2. LLVM Data Structures * **Do This:** * Use LLVM-specific data structures like "SmallVector", "StringRef", and "ArrayRef" where appropriate. These are optimized for common LLVM use cases. * **Don't Do This:** * Use standard library containers ("std::vector", "std::string") without carefully considering performance implications. * **Why:** LLVM data structures are designed to be efficient and memory-friendly for specific LLVM tasks. ### 5.3. LLVM Diagnostic Infrastructure * **Do This:** * Use the LLVM diagnostic infrastructure ("llvm::DiagnosticInfo", "llvm::SourceMgr", "llvm::LLVMContext") for reporting errors, warnings, and remarks. * **Don't Do This:** * Print diagnostic messages directly to "std::cerr" or "std::cout". 
* **Why:** The LLVM diagnostic infrastructure provides a consistent and extensible way to report diagnostic information, allowing tools to handle diagnostics in a uniform manner.
### 5.4. LLVM Pass Infrastructure
* **Do This:**
  * Use the LLVM pass infrastructure for implementing compiler passes.
  * Follow the standard pass structure. With the new pass manager, derive from "PassInfoMixin" and implement "run" (for example "run(Function &, FunctionAnalysisManager &)"); the "runOnFunction" and "runOnModule" methods belong to the legacy pass manager.
* **Don't Do This:**
  * Implement custom pass management mechanisms.
* **Why:** The LLVM pass infrastructure provides a uniform and efficient way to implement and manage compiler passes.
### 5.5. Including LLVM Headers
* **Do This:** Include system headers using angle brackets "<>". Use quotes for LLVM headers and for headers that are local to your project or component. Following the LLVM Coding Standards, order the groups as: main module header, local/private headers, LLVM project headers, system headers. Sort includes alphabetically within each group.
* **Don't Do This:** Mix include styles or use relative paths for LLVM distribution headers.
* **Why:** This helps distinguish standard library headers from project-specific headers and keeps include paths clean and maintainable. Including a component's own headers first also exposes headers that fail to include what they depend on.
**Example:**
"""c++
#include "MyComponent/MyHeader.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Function.h"
#include "llvm/Pass.h"
#include <algorithm>
#include <vector>
"""
## 6. Security Best Practices
### 6.1. Input Validation
* **Do This:** Validate all external inputs to prevent vulnerabilities such as buffer overflows, format string bugs, and code injection.
* **Don't Do This:** Assume that external inputs are safe or well-formed.
* **Why:** Input validation is essential for preventing security vulnerabilities.
### 6.2. Memory Safety
* **Do This:**
  * Use memory-safe programming techniques (e.g., bounds checking, smart pointers).
  * Be careful when using raw pointers and manual memory management.
* **Don't Do This:**
  * Write code that is prone to buffer overflows, use-after-free errors, or other memory-related vulnerabilities.
* **Why:** Memory safety is crucial for preventing security exploits. ### 6.3. Integer Overflows * **Do This:** Check for integer overflows when performing arithmetic operations, especially when dealing with sizes and indices. * **Don't Do This:** Assume that integer arithmetic is always safe. * **Why:** Integer overflows can lead to unexpected behavior and security vulnerabilities. ### 6.4. Safe String Handling * **Do This:** Use safe string handling functions (e.g., "llvm::StringRef::startswith", "llvm::StringRef::endswith") to avoid buffer overflows and format string bugs. * **Don't Do This:** Use unsafe string functions like "strcpy" or "sprintf". * **Why:** Safe string handling prevents common security vulnerabilities related to string manipulation. ## 7. Performance Optimization ### 7.1. Data Locality * **Do This:** Design data structures and algorithms to maximize data locality, which can improve cache utilization and reduce memory access latency. * **Don't Do This:** Access memory in a random or scattered manner. * **Why:** Data locality can significantly improve performance, especially for memory-bound applications. ### 7.2. Avoiding Unnecessary Copies * **Do This:** Use move semantics to avoid unnecessary copying of objects. * **Don't Do This:** Pass large objects by value when they can be passed by reference or moved. * **Why:** Copying large objects can be expensive, especially when they are not modified. ### 7.3. Efficient Algorithms * **Do This:** Choose efficient algorithms (e.g., sorting, searching) that are appropriate for the specific task. * **Don't Do This:** Use naive or inefficient algorithms that can lead to poor performance. * **Why:** Algorithm choice has a significant impact on performance. ### 7.4. Profiling * **Do This:** Use profiling tools to identify performance bottlenecks and optimize the most critical sections of code. Consider using LLVM's built-in profiling capabilities. 
* **Don't Do This:** Guess at performance issues without empirical evidence. * **Why:** Profiling provides valuable insights into performance bottlenecks and helps focus optimization efforts on the most critical areas. ## 8. Tooling and Automation ### 8.1. Clang-Format * **Do This:** Use "clang-format" to automatically format code according to the LLVM coding style. Configure your IDE or editor to automatically run clang-format on save. * **Don't Do This:** Manually format code or ignore "clang-format" warnings. * **Why:** "clang-format" ensures consistent code formatting and reduces the burden of manual formatting. ### 8.2. Clang-Tidy * **Do This:** Use "clang-tidy" to automatically check code for style violations, potential bugs, and security vulnerabilities. Configure your build system to run "clang-tidy" as part of the build process. * **Don't Do This:** Ignore "clang-tidy" warnings or disable checks without a valid reason. * **Why:** "clang-tidy" helps identify and fix issues early in the development cycle. ### 8.3. Continuous Integration * **Do This:** Use a continuous integration (CI) system to automatically build, test, and analyze code changes before they are merged into the main branch. * **Don't Do This:** Merge code changes without proper CI testing. * **Why:** CI ensures that code changes do not break the build, introduce new bugs, or violate coding standards. By adhering to these code style and conventions standards, LLVM developers can contribute to a more consistent, readable, and maintainable codebase, ultimately leading to a better and more secure compiler infrastructure. These standards are intended to guide both human developers and AI coding assistants in producing high-quality LLVM code.
# Deployment and DevOps Standards for LLVM

This document outlines deployment and DevOps standards for LLVM projects. It focuses on build processes, continuous integration/continuous deployment (CI/CD), and production considerations. The goal is to ensure LLVM projects are built, tested, and deployed efficiently, reliably, and securely.

## 1. Build Processes

A well-defined build process is crucial for producing consistent and reproducible artifacts. LLVM uses CMake as its primary build system.

### 1.1. CMake Standards

CMake is fundamental to building LLVM. Proper usage ensures portability and maintainability.

**Do This:**

* Use CMake targets extensively. This makes dependencies explicit and simplifies the build graph.
* Employ generator expressions ("$<...>") for conditional compilation based on build configurations (Debug, Release, etc.).
* Use CMake modules to encapsulate common build logic.
* Leverage features provided by the "LLVM-Config.cmake" module (installed with LLVM) within projects using LLVM.
* Prefer "target_link_libraries", "target_include_directories", "target_compile_definitions", and "target_compile_features" over global settings.

**Don't Do This:**

* Manipulate compiler flags directly (e.g., by setting "CMAKE_CXX_FLAGS"). Use CMake's built-in mechanisms instead.
* Overuse "execute_process" for tasks that can be handled by CMake commands.
* Hardcode paths. Rely on CMake variables (e.g., "CMAKE_SOURCE_DIR", "CMAKE_BINARY_DIR") instead.

**Why:** CMake provides a platform-independent and well-structured way to manage the build process. Adhering to these standards keeps builds portable and easier to maintain.
**Example:**

"""cmake
# CMakeLists.txt
cmake_minimum_required(VERSION 3.13) # Ensure a modern CMake version
project(MyProject)

# Find LLVM
find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")

add_executable(MyTool MyTool.cpp)

# Expose the LLVM headers to this target only
target_include_directories(MyTool PRIVATE ${LLVM_INCLUDE_DIRS})

# Link against LLVM libraries
target_link_libraries(MyTool PRIVATE LLVMSupport LLVMCore)

# Add compile definitions based on build type:
target_compile_definitions(MyTool PRIVATE $
"""
# Core Architecture Standards for LLVM

This document outlines the core architecture standards for LLVM development, providing guidelines to ensure consistency, maintainability, performance, and security within the LLVM project. It focuses on the fundamental architectural patterns, project structure, and organization principles that govern the LLVM codebase.

## 1. Fundamental Architectural Patterns

### 1.1. The Three-Phase Design (Frontend, Optimizer, Backend)

**Description:** LLVM employs a three-phase design: a frontend that parses source code into an intermediate representation (IR), an optimizer that performs transformations on the IR, and a backend that translates the IR into machine code. This separation of concerns is a cornerstone of LLVM's flexibility and retargetability.

**Do This:**

* Design components within the frontend, optimizer, or backend that adhere to their respective responsibilities. Avoid mixing concerns across phases.
* Ensure each phase communicates through the well-defined LLVM IR.
* Frontend components should aim to generate semantically equivalent IR regardless of the source language's specific syntax.
* Backends should focus on target machine specific optimizations and code generation without modifying program semantics beyond that which is required for the target architecture.

**Don't Do This:**

* Implement source language-specific optimizations in the backend. These transformations belong in the optimizer (or potentially the frontend during initial lowering).
* Bypass the IR for direct communication between frontends and backends. This breaks the modularity and retargetability.

**Why This Matters:**

* **Maintainability:** Clear separation makes it easier to understand, modify, and extend individual phases without affecting others.
* **Retargetability:** The IR acts as a stable interface, allowing new frontends and backends to be added without requiring changes to the core optimizer.
* **Optimization:** Centralized optimization within the optimizer phase allows for target-independent improvements that benefit all languages and architectures.

**Code Example (Illustrative):**

"""c++
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Transforms/InstCombine/InstCombine.h"
#include "llvm/Transforms/Scalar/GVN.h"
#include "llvm/Transforms/Scalar/Reassociate.h"
#include "llvm/Transforms/Scalar/SimplifyCFG.h"
#include <memory>

// Frontend (Clang) - Generates LLVM IR from C++ source
// (Simplified example)
// The caller owns the LLVMContext, which must outlive the returned module.
std::unique_ptr<llvm::Module> generateIR(llvm::LLVMContext &context, const char *src) {
  // Parse C++ code and build AST
  // ...
  // Transform AST into LLVM IR instructions
  auto module = std::make_unique<llvm::Module>("my_module", context);
  llvm::FunctionType *funcType = llvm::FunctionType::get(llvm::Type::getInt32Ty(context), false);
  llvm::Function *mainFunc = llvm::Function::Create(funcType, llvm::Function::ExternalLinkage, "main", module.get());
  // Create a basic block and insert instructions (e.g., return 0)
  llvm::BasicBlock *entry = llvm::BasicBlock::Create(context, "entrypoint", mainFunc);
  llvm::IRBuilder<> builder(entry);
  llvm::Value *retVal = llvm::ConstantInt::get(llvm::Type::getInt32Ty(context), 0);
  builder.CreateRet(retVal);
  return module;
}

// Optimizer (LLVM core) - Optimizes the LLVM IR with the new pass manager
void optimizeIR(llvm::Module &module) {
  // The new pass manager needs its analysis managers registered and cross-linked.
  llvm::LoopAnalysisManager lam;
  llvm::FunctionAnalysisManager fam;
  llvm::CGSCCAnalysisManager cgam;
  llvm::ModuleAnalysisManager mam;
  llvm::PassBuilder pb;
  pb.registerModuleAnalyses(mam);
  pb.registerCGSCCAnalyses(cgam);
  pb.registerFunctionAnalyses(fam);
  pb.registerLoopAnalyses(lam);
  pb.crossRegisterProxies(lam, fam, cgam, mam);

  // Add function-level optimization passes
  llvm::FunctionPassManager fpm;
  fpm.addPass(llvm::InstCombinePass());
  fpm.addPass(llvm::ReassociatePass());
  fpm.addPass(llvm::GVNPass());
  fpm.addPass(llvm::SimplifyCFGPass());

  llvm::ModulePassManager mpm;
  mpm.addPass(llvm::createModuleToFunctionPassAdaptor(std::move(fpm)));
  mpm.run(module, mam);
}

// Backend (e.g., x86) - Generates machine code from optimized IR
void generateMachineCode(llvm::Module &module) {
  // Target-specific code generation logic
  // ...
  // Use TargetMachine to emit native code.
}
"""

### 1.2. The Module, Function, BasicBlock, and Instruction Hierarchy

**Description:** LLVM IR is structured hierarchically: A "Module" contains "Function"s, which contain "BasicBlock"s, which contain "Instruction"s. This structure models the program's organization and control flow.

**Do This:**

* Understand and leverage this hierarchy when manipulating IR. Functions represent procedures, basic blocks represent straight-line code segments, and instructions are the fundamental operations.
* Use LLVM's APIs to traverse and modify the IR structure.
* Follow conventions for naming functions, basic blocks, and instructions to improve readability. Consider using debug metadata to preserve source-level names.

**Don't Do This:**

* Treat the IR as a flat list of instructions. The hierarchical structure enables powerful analyses and transformations.
* Manually manipulate raw pointers to IR objects. Use the provided LLVM APIs (e.g., iterators, "replaceAllUsesWith") for safe and correct manipulation.

**Why This Matters:**

* **Organization:** The hierarchy reflects the program's structure, enabling modular analysis and transformation.
* **Data Flow:** The basic block structure naturally aligns with data flow analysis.
* **Analysis and Optimization:** Many optimization passes rely on the hierarchical structure, such as function-level inlining, basic block reordering, and loop unrolling.
**Code Example:**

"""c++
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
#include <string>

// Creating a function and a basic block, adding instructions
llvm::Function *createFunction(llvm::Module *module, const std::string &name) {
  llvm::LLVMContext &context = module->getContext();
  llvm::FunctionType *funcType = llvm::FunctionType::get(llvm::Type::getInt32Ty(context), false);
  llvm::Function *func = llvm::Function::Create(funcType, llvm::Function::ExternalLinkage, name, module);
  llvm::BasicBlock *entryBB = llvm::BasicBlock::Create(context, "entry", func);
  llvm::IRBuilder<> builder(entryBB);
  // Create an integer constant
  llvm::Value *constant = llvm::ConstantInt::get(llvm::Type::getInt32Ty(context), 42);
  // Create a return instruction
  builder.CreateRet(constant);
  return func;
}

// Iterating through the basic blocks of a function and the instructions of each block
void printInstructions(llvm::Function *func) {
  for (llvm::BasicBlock &bb : *func) {
    llvm::outs() << "Basic Block: " << bb.getName() << "\n";
    for (llvm::Instruction &inst : bb) {
      llvm::outs() << " " << inst << "\n";
    }
  }
}
"""

## 2. Project Structure and Organization

### 2.1. Directory Structure Conventions

**Description:** LLVM follows a strict directory structure to organize source code. This structure promotes discoverability and reduces namespace collisions.

**Do This:**

* Place source files related to a specific component (e.g., a backend, an optimization pass) within a dedicated directory under the appropriate top-level directory (e.g., "lib/Target/", "lib/Transforms/").
* Use meaningful directory and file names that clearly indicate the component's purpose.
* Maintain consistent naming conventions (e.g., UpperCamelCase for type names, lowerCamelCase for function names).

**Don't Do This:**

* Place unrelated source files in the same directory.
* Create deeply nested directory structures that make it difficult to navigate the codebase.
* Ignore the existing directory structure and create new directories without justification.
**Why This Matters:**

* **Discoverability:** A well-defined directory structure makes it easier to find and understand the code related to a specific feature.
* **Namespace Management:** Separating components into different directories reduces the risk of naming conflicts.
* **Build System Integration:** The directory structure is closely tied to the build system, ensuring that source files are compiled and linked correctly.

**Example:** The X86 backend resides in "lib/Target/X86/". Within that directory, you will find subdirectories like "AsmParser", "Disassembler", "MCTargetDesc", etc., each dedicated to a distinct aspect of the backend.

### 2.2. Component-Based Design

**Description:** LLVM's architecture promotes modularity through the use of components. Each component encapsulates a specific functionality and exposes a well-defined interface.

**Do This:**

* Design new features as independent components with clear interfaces.
* Minimize dependencies between components to improve maintainability and testability.
* Use the pimpl idiom (pointer to implementation) to hide implementation details and help preserve binary compatibility.
* Consider a class hierarchy for extensibility if multiple derived classes provide differing implementations of an interface. The visitor pattern is one way to operate on such a hierarchy.

**Don't Do This:**

* Create monolithic components that perform multiple unrelated tasks.
* Introduce tight coupling between components, making it difficult to modify or replace them independently.

**Why This Matters:**

* **Maintainability:** Components can be developed, tested, and deployed independently.
* **Reusability:** Components can be reused in different parts of the system.
* **Testability:** Components can be tested in isolation with minimal dependencies.
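The class-hierarchy and visitor suggestion above can be sketched as follows. This is a minimal illustration in standard C++ only, and every name in it ("Node", "Constant", "Add", "NodeVisitor", "Evaluator", "evaluate") is hypothetical rather than an LLVM class. It uses plain virtual dispatch, whereas LLVM's own "InstVisitor" is a CRTP-based variant of the same idea.

```cpp
#include <memory>
#include <utility>

class Constant;
class Add;

// Visitor interface: one overload per concrete node type.
class NodeVisitor {
public:
  virtual ~NodeVisitor() = default;
  virtual void visit(const Constant &node) = 0;
  virtual void visit(const Add &node) = 0;
};

// Base of the hierarchy: each node dispatches to the matching overload.
class Node {
public:
  virtual ~Node() = default;
  virtual void accept(NodeVisitor &visitor) const = 0;
};

class Constant : public Node {
public:
  explicit Constant(int v) : value(v) {}
  int getValue() const { return value; }
  void accept(NodeVisitor &visitor) const override { visitor.visit(*this); }

private:
  int value;
};

class Add : public Node {
public:
  Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  const Node &getLHS() const { return *lhs; }
  const Node &getRHS() const { return *rhs; }
  void accept(NodeVisitor &visitor) const override { visitor.visit(*this); }

private:
  std::unique_ptr<Node> lhs;
  std::unique_ptr<Node> rhs;
};

// One operation over the hierarchy, written without touching the node classes.
class Evaluator : public NodeVisitor {
public:
  void visit(const Constant &node) override { result = node.getValue(); }
  void visit(const Add &node) override {
    node.getLHS().accept(*this);
    int left = result;
    node.getRHS().accept(*this);
    result = left + result;
  }
  int getResult() const { return result; }

private:
  int result = 0;
};

// Convenience wrapper: evaluates the expression tree rooted at `root`.
int evaluate(const Node &root) {
  Evaluator evaluator;
  root.accept(evaluator);
  return evaluator.getResult();
}
```

New operations (printing, validation, and so on) become new visitor classes, so the node classes stay unchanged. The trade-off is that adding a new node type requires touching every visitor.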
**Code Example (Pimpl Idiom):**

"""c++
// Header file (MyComponent.h)
#include <memory>

class MyComponent {
public:
  MyComponent();
  ~MyComponent(); // Defined in the source file, where Impl is a complete type
  void doSomething();

private:
  class Impl;
  std::unique_ptr<Impl> impl; // Owns the implementation; no manual delete
};

// Source file (MyComponent.cpp)
#include "MyComponent.h"
#include "llvm/Support/raw_ostream.h"

class MyComponent::Impl {
public:
  void doSomethingImpl() {
    // Implementation details
    llvm::outs() << "MyComponent is doing something!\n";
  }
};

MyComponent::MyComponent() : impl(std::make_unique<Impl>()) {}
MyComponent::~MyComponent() = default;

void MyComponent::doSomething() { impl->doSomethingImpl(); }
"""

## 3. Memory Management

### 3.1. RAII (Resource Acquisition Is Initialization)

**Description:** LLVM heavily relies on RAII to manage resources, particularly memory. This approach ensures that resources are automatically released when an object goes out of scope.

**Do This:**

* Use smart pointers (e.g., "std::unique_ptr", "std::shared_ptr") or custom RAII classes to manage dynamically allocated memory.
* Prefer "std::unique_ptr" for exclusive ownership and "std::shared_ptr" for shared ownership.
* Avoid raw "new" and "delete" whenever possible.

**Don't Do This:**

* Manually allocate and deallocate memory without using RAII.
* Forget to release allocated memory, leading to memory leaks.

**Why This Matters:**

* **Memory Safety:** RAII ensures that memory is released when its owner goes out of scope, even in the presence of exceptions.
* **Resource Management:** RAII can be used to manage other resources besides memory, such as file handles and locks.
* **Code Clarity:** RAII makes code easier to read and understand by tying resource lifetime to object lifetime.

**Code Example:**

"""c++
// Using std::unique_ptr for memory management
#include <memory>

void processData() {
  auto data = std::make_unique<int[]>(100); // Allocate a zero-initialized array
  // ... use the array
} // The array is automatically deallocated when data goes out of scope
"""

### 3.2. LLVM's Memory Allocators (BumpPtrAllocator, etc.)
**Description:** LLVM provides custom memory allocators like "BumpPtrAllocator" for efficient allocation of many small objects. These allocators are optimized for specific use cases within the compiler.

**Do This:**

* Use "BumpPtrAllocator" for allocating small objects within a compilation unit or pass where memory can be freed all at once.
* Consider using other LLVM allocators, such as "SpecificBumpPtrAllocator", which holds objects of a single type and runs their destructors.
* Be mindful of the lifetime of the allocator and the objects it allocates.

**Don't Do This:**

* Use "BumpPtrAllocator" for long-lived objects that need to be individually deallocated.
* Mix allocations from different allocators without careful consideration of ownership.

**Why This Matters:**

* **Performance:** Custom allocators can be significantly faster than general-purpose allocators for certain use cases.
* **Memory Efficiency:** Specialized allocators can reduce memory fragmentation and overhead.
* **Integration:** LLVM allocators are designed to work seamlessly with the LLVM ecosystem.

**Code Example:**

"""c++
#include "llvm/Support/Allocator.h"
#include <new>

void allocateObjects(llvm::BumpPtrAllocator &allocator) {
  // Allocate<T>() returns suitably aligned, uninitialized storage for one T.
  int *ptr1 = new (allocator.Allocate<int>()) int(10);
  double *ptr2 = new (allocator.Allocate<double>()) double(3.14);
  // ... use the allocated objects
  // The memory is reclaimed when the allocator is destroyed. Destructors are
  // not run, so only trivially destructible types should be stored this way.
}

int main() {
  llvm::BumpPtrAllocator allocator;
  allocateObjects(allocator);
  return 0; // Allocator is destroyed here, freeing all allocated memory.
}
"""

## 4. Error Handling and Assertions

### 4.1. LLVM's Error Handling Mechanisms (llvm::Error, Expected<T>)

**Description:** LLVM introduces "llvm::Error" and "llvm::Expected<T>" for explicit error handling. These mechanisms provide a structured way to represent and propagate errors, improving robustness and maintainability.
In certain areas they replace the older error handling based on "std::error_code".

**Do This:**

* Use "llvm::Error" to represent recoverable errors that can be handled by the caller.
* Use "llvm::Expected<T>" to return either a value of type "T" or an "llvm::Error" indicating failure.
* Propagate errors up the call stack using "llvm::Error" or by returning "llvm::Expected<T>". Check whether a value holds an error ("if (!expected_value)") before using it.

**Don't Do This:**

* Use exceptions for recoverable errors. LLVM largely avoids exceptions.
* Ignore error codes or simply print error messages and continue execution. All errors should be handled and/or propagated.
* Return a raw "bool" to signal failure. Prefer "llvm::Error".

**Why This Matters:**

* **Explicit Error Handling:** "llvm::Error" and "llvm::Expected<T>" make error handling explicit and visible in the code.
* **Error Propagation:** These mechanisms ensure that errors are propagated up the call stack, allowing calling functions to handle them appropriately.
* **Robustness:** Proper error handling improves the robustness of the system and prevents unexpected crashes.

**Code Example:**

"""c++
#include "llvm/Support/Error.h"
#include "llvm/Support/raw_ostream.h"

llvm::Expected<int> divide(int a, int b) {
  if (b == 0) {
    return llvm::make_error<llvm::StringError>("Division by zero", llvm::inconvertibleErrorCode());
  }
  return a / b;
}

int main() {
  llvm::Expected<int> result = divide(10, 2);
  if (result) {
    llvm::outs() << "Result: " << *result << "\n";
  } else {
    // takeError() hands over the error; toString() consumes it.
    llvm::errs() << "Error: " << llvm::toString(result.takeError()) << "\n";
  }

  result = divide(5, 0);
  if (result) {
    llvm::outs() << "Result: " << *result << "\n";
  } else {
    llvm::errs() << "Error: " << llvm::toString(result.takeError()) << "\n";
  }
  return 0;
}
"""

### 4.2. Assertions for Internal Consistency

**Description:** LLVM uses assertions extensively to check for internal consistency and preconditions.
Assertions are used to catch programming errors during development and debugging.

**Do This:**

* Use "assert()" liberally to check for conditions that should always be true.
* Provide informative error messages in assertions to help diagnose problems.
* Disable assertions in production builds to avoid performance overhead.
* Review and update assertions regularly as the code evolves.

**Don't Do This:**

* Use assertions to check for recoverable errors. Assertions should only be used for internal consistency checks.
* Rely on assertions to prevent security vulnerabilities. Security checks should be implemented separately.
* Leave dead code and dead assertions in the code.

**Why This Matters:**

* **Early Error Detection:** Assertions catch programming errors early in the development cycle, reducing debugging time.
* **Code Clarity:** Assertions document the expected behavior of the code.
* **Debugging Aid:** Assertions provide valuable information for diagnosing problems.

**Code Example:**

"""c++
#include <cassert>

void processValue(int value) {
  // The string literal is always true, so it only adds a message to the failure output.
  assert(value >= 0 && "Value must be non-negative");
  // ... use the value
}
"""

## 5. Code Formatting and Style

### 5.1. LLVM's Formatting Style (clang-format)

**Description:** LLVM uses "clang-format" to enforce a consistent code formatting style. This ensures that all code in the LLVM project adheres to the same conventions.

**Do This:**

* Use "clang-format" to format all code before committing changes.
* Install and configure "clang-format" to integrate with your editor or IDE.
* Follow the LLVM Coding Standards published on the LLVM website. "clang-format" enforces their formatting rules.
* If you reformat code that you are not otherwise editing, put the reformatting in a separate commit.

**Don't Do This:**

* Skip "clang-format" and format code by hand.
* Introduce style inconsistencies into the codebase.
* Disable "clang-format" checks in the build system.

**Why This Matters:**

* **Consistency:** Consistent formatting makes code easier to read and understand.
* **Collaboration:** Consistent formatting reduces conflicts and improves collaboration between developers.
* **Automation:** "clang-format" automates the formatting process, saving time and effort.

### 5.2. Naming Conventions

**Description:** LLVM follows well-defined naming conventions for variables, functions, classes, and other identifiers.

**Do This:**

* Use descriptive names that clearly indicate the purpose of the identifier.
* Follow the LLVM naming conventions for different types of identifiers (e.g., UpperCamelCase for type and variable names, lowerCamelCase for function names).
* Use short names for variables with limited scopes (e.g., loop indices).

**Don't Do This:**

* Use cryptic or ambiguous names that are difficult to understand.
* Violate the LLVM naming conventions.
* Use overly long names, especially when the context is clear.

**Code Example:**

"""c++
// Example of naming conventions
class MyClassName {       // Type name: UpperCamelCase
public:
  void myMethodName() {   // Function name: lowerCamelCase
    int I = 0;            // Variable name: UpperCamelCase (short name for limited scope)
    // ...
  }
};
"""

## 6. Concurrency and Thread Safety

### 6.1. Thread-Safety Considerations

**Description:** LLVM is increasingly used in multi-threaded environments. Therefore, thread safety is a critical concern.

**Do This:**

* Identify and protect shared data structures with appropriate locking mechanisms.
* Use fine-grained locking to minimize contention and improve performance.
* Use LLVM's synchronization primitives and follow its concurrency best practices.
* Prefer immutable data structures where possible to avoid synchronization issues.

**Don't Do This:**

* Introduce data races or other concurrency bugs.
* Use global variables without proper synchronization.
* Assume that code is thread-safe without explicit verification.

**Why This Matters:**

* **Correctness:** Thread safety ensures that code behaves correctly in multi-threaded environments.
* **Performance:** Efficient synchronization mechanisms minimize performance overhead.
* **Scalability:** Thread-safe code scales better on multi-core processors.

### 6.2. LLVM's Synchronization Primitives

**Description:** LLVM provides synchronization primitives (e.g., "llvm::sys::Mutex", "llvm::sys::ScopedLock") in "llvm/Support/Mutex.h".

**Do This:**

* Prefer LLVM's synchronization primitives over platform-specific primitives.
* Use a scoped lock such as "llvm::sys::ScopedLock" or "std::lock_guard" to ensure that locks are always released, even in the presence of exceptions.
* Document the locking strategy used to protect shared data structures.

**Don't Do This:**

* Use raw mutexes and condition variables without proper RAII wrappers.
* Hold locks for extended periods of time, blocking other threads unnecessarily.
* Introduce deadlocks by acquiring locks in inconsistent orders.

**Code Example:**

"""c++
#include "llvm/Support/Mutex.h"
#include <mutex>

class ThreadSafeCounter {
private:
  int counter = 0;
  mutable llvm::sys::Mutex mutex; // mutable so that const readers can lock it

public:
  void increment() {
    std::lock_guard<llvm::sys::Mutex> lock(mutex);
    counter++;
  }

  int getCount() const {
    std::lock_guard<llvm::sys::Mutex> lock(mutex);
    return counter;
  }
};
"""

## 7. Optimization and Performance

### 7.1. Code Profiling and Benchmarking

**Description:** LLVM provides tools for profiling and benchmarking code to identify performance bottlenecks.

**Do This:**

* Use LLVM's profiling tools to identify hot spots in the code.
* Create microbenchmarks to measure the performance of specific algorithms and data structures.
* Use regression tests to ensure that performance improvements are not lost over time.

**Don't Do This:**

* Make performance optimizations without measuring their impact.
* Ignore performance regressions in the test suite.
* Optimize prematurely without identifying the real bottlenecks.

### 7.2. Data Structure Choices

**Description:** The choice of data structures can have a significant impact on performance.

**Do This:**

* Choose data structures that are appropriate for the access patterns and data sizes.
* Use efficient data structures for frequently accessed data.
* Consider using specialized data structures provided by LLVM (e.g., "SmallVector").

**Don't Do This:**

* Use inefficient data structures without considering the performance implications.
* Rely on default data structures without performance testing.

**Code Example:**

"""c++
#include "llvm/ADT/SmallVector.h"

// Taking SmallVectorImpl keeps the function independent of the caller's inline capacity.
void processValues(const llvm::SmallVectorImpl<int> &values) {
  for (int value : values) {
    // ...
  }
}

void caller() {
  // SmallVector stores up to 16 elements inline before falling back to the heap.
  llvm::SmallVector<int, 16> values = {1, 2, 3};
  processValues(values);
}
"""

This coding standards document serves as a guide for LLVM developers, helping to ensure consistency, maintainability, performance, and security within the LLVM project. By adhering to these guidelines, developers can contribute to a high-quality codebase that meets the needs of the LLVM community. Its specific, detailed instructions are also meant to be used by AI coding assistants.