Cendol is a C23 compiler implemented in Rust. It is a learning project for understanding how a compiler is built from scratch, with a focus on high-performance compiler architecture and comprehensive C23 standard compliance.
- Full C23 Preprocessor: Complete preprocessor with macro expansion, conditional compilation, file inclusion, and built-in macros (__FILE__, __LINE__, etc.)
- Lexer: Tokenization of C23 source code with proper handling of literals, keywords, and operators
- Parser: Comprehensive C23 syntax parsing using Pratt parsing for expressions and recursive descent for statements
- Semantic Analysis: Type checking, symbol resolution, and semantic validation
- Code Generation: Compiles to native object code using Cranelift backend
- Linker Integration: Automatic invocation of system linker (clang) to produce executables
- Rich Diagnostics: Error reporting with source location tracking
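The feature list mentions Pratt parsing for expressions. The following is a minimal sketch of that technique on a toy arithmetic grammar, not Cendol's actual parser: the names `Token`, `Parser`, `binding_power`, and `eval` are hypothetical, and the sketch evaluates the expression directly instead of building an AST.

```rust
// Hypothetical token type for a toy expression language (not Cendol's real tokens).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Num(i64),
    Op(char),
    LParen,
    RParen,
    Eof,
}

// Splits the input into tokens; any unrecognized character becomes an `Op`.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            ' ' | '\t' | '\r' | '\n' => { chars.next(); }
            '0'..='9' => {
                let mut n = 0i64;
                while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n * 10 + i64::from(d);
                    chars.next();
                }
                tokens.push(Token::Num(n));
            }
            '(' => { chars.next(); tokens.push(Token::LParen); }
            ')' => { chars.next(); tokens.push(Token::RParen); }
            _ => { chars.next(); tokens.push(Token::Op(c)); }
        }
    }
    tokens
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Token {
        self.tokens.get(self.pos).copied().unwrap_or(Token::Eof)
    }

    fn next(&mut self) -> Token {
        let t = self.peek();
        self.pos += 1;
        t
    }

    // (left, right) binding powers; higher binds tighter.
    // right > left makes the operator left-associative.
    fn binding_power(op: char) -> Option<(u8, u8)> {
        match op {
            '+' | '-' => Some((1, 2)),
            '*' | '/' => Some((3, 4)),
            _ => None,
        }
    }

    // Core Pratt loop: parse a prefix, then fold in infix operators
    // whose left binding power is at least `min_bp`.
    fn expr(&mut self, min_bp: u8) -> Result<i64, String> {
        let mut lhs = match self.next() {
            Token::Num(n) => n,
            Token::Op('-') => -self.expr(5)?, // prefix minus binds tighter than any binary op
            Token::LParen => {
                let v = self.expr(0)?;
                if self.next() != Token::RParen {
                    return Err("expected ')'".into());
                }
                v
            }
            t => return Err(format!("unexpected token {t:?}")),
        };
        loop {
            let op = match self.peek() {
                Token::Op(op) => op,
                _ => break,
            };
            let Some((l_bp, r_bp)) = Self::binding_power(op) else {
                return Err(format!("unknown operator {op}"));
            };
            if l_bp < min_bp {
                break;
            }
            self.next();
            let rhs = self.expr(r_bp)?;
            lhs = match op {
                '+' => lhs + rhs,
                '-' => lhs - rhs,
                '*' => lhs * rhs,
                '/' => {
                    if rhs == 0 {
                        return Err("division by zero".into());
                    }
                    lhs / rhs
                }
                _ => unreachable!(),
            };
        }
        Ok(lhs)
    }
}

fn eval(src: &str) -> Result<i64, String> {
    let mut parser = Parser { tokens: tokenize(src), pos: 0 };
    let value = parser.expr(0)?;
    if parser.peek() != Token::Eof {
        return Err("trailing input".into());
    }
    Ok(value)
}
```

With these binding powers, `eval("1 + 2 * 3")` returns `Ok(7)` and `eval("10 - 4 - 3")` returns `Ok(3)`, showing precedence and left associativity. A real C expression parser would use a much larger operator table and produce AST nodes.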
- No Trigraph Support: Trigraphs (three-character sequences like ??=, ??<, etc.) are not supported. Note: Trigraphs were officially removed in C23.
- No Digraph Support: Digraphs (two-character sequences like <:, :>, <%, %>, %:, %:%:) are not supported. This compiler targets modern C and does not implement legacy digraph tokens.
- No K&R Function Declarations: In compliance with C23, functions declared with an empty parameter list (e.g., int foo()) are treated as having no parameters (equivalent to int foo(void)).
- Limited Inline Assembly Support: Inline assembly (using the asm or __asm__ keywords) has only limited support, depending on the Cranelift backend capabilities.
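Because trigraphs are not supported, legacy sources that rely on them need to be updated before compiling. As an illustration only, here is a small sketch of a scanner that reports trigraph sequences with their source locations. Whether Cendol itself emits such a diagnostic is not stated here; `Diagnostic` and `find_trigraphs` are hypothetical names.

```rust
/// A diagnostic with a 1-based source location (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Diagnostic {
    line: usize,
    column: usize,
    message: String,
}

/// Third character of each of the nine trigraphs defined by pre-C23 standards;
/// every trigraph starts with "??".
const TRIGRAPH_TAILS: &[u8] = b"=(/)'<!>-";

/// Scans the source text and reports every trigraph sequence found.
/// Columns are byte offsets within the line, and the scan does not skip
/// string literals or comments.
fn find_trigraphs(source: &str) -> Vec<Diagnostic> {
    let mut diagnostics = Vec::new();
    for (line_idx, line) in source.lines().enumerate() {
        for (col_idx, window) in line.as_bytes().windows(3).enumerate() {
            if window[0] == b'?' && window[1] == b'?' && TRIGRAPH_TAILS.contains(&window[2]) {
                diagnostics.push(Diagnostic {
                    line: line_idx + 1,
                    column: col_idx + 1,
                    message: format!(
                        "trigraph '{}' is not supported",
                        String::from_utf8_lossy(window)
                    ),
                });
            }
        }
    }
    diagnostics
}
```

For the input `"int a??(3??);\n??=define X"`, the sketch reports three locations: line 1 column 6, line 1 column 10, and line 2 column 1.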
Cendol follows a traditional multi-phase compiler architecture optimized for performance:
- Preprocessing Phase: Expands macros and resolves file inclusion in the C source
- Lexing Phase: Converts preprocessed tokens to lexical tokens
- Parsing Phase: Builds a flattened Abstract Syntax Tree (AST)
- Semantic Analysis Phase: Performs type checking and symbol resolution
- MIR Generation: Lowers AST to Mid-level Intermediate Representation
- Code Generation: Generates native machine code via Cranelift
- Linking: Links object files to create the final executable
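The parsing phase above builds a flattened AST. One common way to do this, shown here as a sketch under assumptions and not as Cendol's actual data structures, is to store all nodes in a single `Vec` and refer to children by index instead of by pointer. `NodeId`, `Node`, and `Ast` are hypothetical names.

```rust
/// Index of a node inside `Ast::nodes`; a small integer instead of a `Box` pointer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct NodeId(u32);

/// A tiny node set for illustration; a real C AST has many more variants.
#[derive(Debug, Clone, Copy)]
enum Node {
    IntLiteral(i64),
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
}

/// All nodes live contiguously in one vector.
#[derive(Default)]
struct Ast {
    nodes: Vec<Node>,
}

impl Ast {
    /// Appends a node and returns its index.
    fn push(&mut self, node: Node) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(node);
        id
    }

    /// Walks the tree by following indices.
    fn eval(&self, id: NodeId) -> i64 {
        match self.nodes[id.0 as usize] {
            Node::IntLiteral(v) => v,
            Node::Add(l, r) => self.eval(l) + self.eval(r),
            Node::Mul(l, r) => self.eval(l) * self.eval(r),
        }
    }
}

/// Builds the tree for `2 + 3 * 4`; children are pushed before their parents.
fn build_example() -> (Ast, NodeId) {
    let mut ast = Ast::default();
    let two = ast.push(Node::IntLiteral(2));
    let three = ast.push(Node::IntLiteral(3));
    let four = ast.push(Node::IntLiteral(4));
    let product = ast.push(Node::Mul(three, four));
    let sum = ast.push(Node::Add(two, product));
    (ast, sum)
}
```

The appeal of this layout is that nodes are small, copyable, and stored contiguously, which tends to be friendlier to the cache and the allocator than one heap allocation per node. Later phases such as semantic analysis can attach information in side tables keyed by `NodeId`.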
- Rust toolchain with 2024 edition support (Rust 1.85 or later)
- Cargo
- Clang (used as the system linker)
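Clang is required because the final link step is delegated to it. As a rough sketch of what such an invocation could look like using the standard library's `std::process::Command`: the exact arguments Cendol passes are not documented here, so the argument list below (`<objects> -o <output>`) and the function names are assumptions.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `clang <objects...> -o <output>` command.
/// The argument layout is an assumption for illustration.
fn linker_command(objects: &[&Path], output: &Path) -> Command {
    let mut cmd = Command::new("clang");
    cmd.args(objects).arg("-o").arg(output);
    cmd
}

/// Runs the linker and reports whether it exited successfully.
/// Returns an I/O error if `clang` cannot be started (for example, if it is not installed).
fn link(objects: &[&Path], output: &Path) -> std::io::Result<bool> {
    let status = linker_command(objects, output).status()?;
    Ok(status.success())
}
```

Separating command construction from execution keeps the argument list easy to inspect in tests without needing Clang on the test machine.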
To build the compiler, run:

    cargo build

For a release build with optimizations:

    cargo build --release

To compile a C file to an executable:

    cargo run -- -o <output_file> <input_file>

Supported options:
- -E: Preprocess only, output preprocessed source to stdout
- -P: Suppress line markers in preprocessor output
- -C: Retain comments in preprocessor output
- -I <path>: Add include search path
- -D <name>[=<value>]: Define preprocessor macro
- --verbose: Enable verbose diagnostic output
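The `-D <name>[=<value>]` form splits into a macro name and an optional value. A minimal sketch of that split follows; it is not Cendol's actual argument handling, and `MacroDef` and `parse_define` are hypothetical names. What value a macro receives when `=<value>` is omitted is left to the caller, since that behavior is not specified here.

```rust
/// A macro definition taken from a `-D <name>[=<value>]` argument.
#[derive(Debug, PartialEq)]
struct MacroDef {
    name: String,
    /// `None` when the argument had no `=<value>` part.
    value: Option<String>,
}

/// Parses the text following `-D`. Only object-like macro names are accepted;
/// function-like forms such as `F(x)=x` are rejected by this sketch.
fn parse_define(arg: &str) -> Result<MacroDef, String> {
    // Split at the first '=' only, so the value may itself contain '='.
    let (name, value) = match arg.split_once('=') {
        Some((name, value)) => (name, Some(value.to_string())),
        None => (arg, None),
    };

    // A C identifier starts with a letter or '_' and continues with letters, digits, or '_'.
    let mut chars = name.chars();
    let valid = match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {
            chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
        }
        _ => false,
    };
    if !valid {
        return Err(format!("invalid macro name: {name:?}"));
    }

    Ok(MacroDef { name: name.to_string(), value })
}
```

For example, `parse_define("DEBUG=1")` yields the name `DEBUG` with value `Some("1")`, while `parse_define("NDEBUG")` yields the name `NDEBUG` with no value.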
Preprocess a file:

    cargo run -- -E test.c

Define macros and include paths:

    cargo run -- -D DEBUG=1 -I /usr/include test.c

Comprehensive design documentation is available in the design-document/ directory:
- Main Architecture - Overall compiler design and goals
- Preprocessor Design - Preprocessing phase details
- Lexer Design - Tokenization strategy
- Parser Design - AST construction
- Semantic Analysis - Type checking and validation
This is a learning project, but contributions are welcome! Areas of interest include:
- Additional C23 language features
- Performance optimizations
- Testing and bug fixes
- Documentation improvements
This project is AI-friendly and welcomes contributions from developers using AI tools. We encourage the use of AI for code generation, debugging, and documentation to enhance productivity.
See the LICENSE file for details.