Overview and Concepts#

This section explains what a code generator is, gives a high-level overview of how to build one, and introduces the concepts used when building a Smithy code generator.

What you're building#

You're building one or more Smithy-Build codegen plugins to generate a client, server, and types from Smithy models for a target environment. A target environment is the intended programming language and specific runtime environment.

Smithy-Build is a tool used to transform Smithy models into artifacts like code or other models. Smithy-Build plugins are implemented to perform these transformations. Codegen plugins are configured with smithy-build.json files, implemented in Java, Kotlin, or another JVM language, and discovered on the classpath through Java Service Provider Interfaces (SPI).

Consider the following smithy-build.json file:

smithy-build.json#

{
   "version": "1.0",
   "plugins": {
       "foo-client-codegen": {
           "service": "smithy.example#Weather",
           "package": "com.example.weather",
           "packageVersion": "0.0.1",
           "edition": "2022"
       }
   }
}

This file tells Smithy-Build to generate a hypothetical Foo language client using the foo-client-codegen plugin found on the classpath.
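As a rough illustration of how a plugin key such as foo-client-codegen could be matched to an implementation found through SPI, here is a minimal sketch. The CodegenPlugin interface, FooClientCodegenPlugin, and PluginLocator are hypothetical stand-ins and are not the Smithy-Build API; only java.util.ServiceLoader is a real JDK class.

```java
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;

// Hypothetical plugin contract (an assumption for this sketch, not the Smithy-Build interface).
interface CodegenPlugin {
    String name();
    String run(Map<String, String> settings);
}

// Hypothetical plugin whose name matches the "foo-client-codegen" key in smithy-build.json.
final class FooClientCodegenPlugin implements CodegenPlugin {
    @Override
    public String name() {
        return "foo-client-codegen";
    }

    @Override
    public String run(Map<String, String> settings) {
        return "Generating " + settings.get("package") + " " + settings.get("packageVersion")
                + " for " + settings.get("service");
    }
}

final class PluginLocator {
    // Returns the first candidate whose name equals the requested plugin key.
    static Optional<CodegenPlugin> find(String name, Iterable<? extends CodegenPlugin> candidates) {
        for (CodegenPlugin plugin : candidates) {
            if (plugin.name().equals(name)) {
                return Optional.of(plugin);
            }
        }
        return Optional.empty();
    }

    // SPI lookup: ServiceLoader finds implementations listed in a
    // META-INF/services/<fully qualified interface name> file on the classpath.
    static Optional<CodegenPlugin> findOnClasspath(String name) {
        return find(name, ServiceLoader.load(CodegenPlugin.class));
    }
}
```

Without a provider registration file on the classpath, findOnClasspath returns an empty Optional, which is why the sketch also accepts an explicit list of candidates.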

Design first, generate second#

The first step to writing a code generator is to not write the code generator. The code generator is an implementation detail. First, you need to decide on what code you want to generate. Pick a Smithy model and manually map each concept of the model to hand-written code. In fact, nearly every aspect of the product you intend to eventually generate can be hand-written as a proof of concept before writing any of the code generator. Things to consider are:

- How will Smithy types map to types in your programming language?
- What will the client and server interfaces look like for a modeled service?
- How are clients and servers created and configured?
- How will you allow the client or server to be customized at runtime?
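To make the first question concrete, here is one possible hand-written mapping from a few Smithy simple types to Java types. The TypeMapping class and the specific choices (for example, mapping blob to byte[]) are assumptions for illustration; a real design would cover every shape type and record the reasoning behind each choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical design sketch: one candidate mapping of Smithy simple types to Java types.
final class TypeMapping {
    private static final Map<String, String> SIMPLE_TYPES = new LinkedHashMap<>();

    static {
        SIMPLE_TYPES.put("blob", "byte[]");
        SIMPLE_TYPES.put("boolean", "Boolean");
        SIMPLE_TYPES.put("string", "String");
        SIMPLE_TYPES.put("integer", "Integer");
        SIMPLE_TYPES.put("long", "Long");
        SIMPLE_TYPES.put("bigDecimal", "java.math.BigDecimal");
        SIMPLE_TYPES.put("timestamp", "java.time.Instant");
    }

    // Returns the Java type chosen for a Smithy simple type, failing loudly on unmapped types
    // so that gaps in the design are noticed early.
    static String javaTypeFor(String smithyType) {
        String javaType = SIMPLE_TYPES.get(smithyType);
        if (javaType == null) {
            throw new IllegalArgumentException("No mapping decided yet for: " + smithyType);
        }
        return javaType;
    }
}
```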

Design documents#

You are encouraged to document major design decisions to explain why design choices were made and leave a record for future contributors.

Example Smithy codegen design documents:

Phases of code generation#

There are three phases of code generation.

- Codegen-time: The phase in which code is being generated for the target environment.
  - Depends on Smithy models
  - Typically written in Java if using the Smithy reference implementation
  - Uses Smithy-Build
  - Generates code
  - Generates dependencies
- Compile-time: Performed in the target environment to compile and/or verify generated code. This phase may be optional for languages that aren't compiled. However, linting and static analysis are also considered part of the compile-time phase.
- Runtime: Generated code is run in the target environment. The Smithy model, Java, and Smithy reference implementation are not required at runtime.
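The codegen-time phase can be pictured as a program that turns model information into source text. The sketch below is a deliberately tiny, hypothetical generator (StructureGenerator is not part of Smithy) that emits Java-like source for a structure. The emitted text is what later goes through compile-time and runtime in the target environment.

```java
import java.util.Map;

// Hypothetical codegen-time step: converts a structure name and its members
// (member name -> target-language type) into source text.
final class StructureGenerator {
    static String generate(String name, Map<String, String> members) {
        StringBuilder out = new StringBuilder();
        out.append("public final class ").append(name).append(" {\n");
        for (Map.Entry<String, String> member : members.entrySet()) {
            out.append("    public ")
               .append(member.getValue())
               .append(' ')
               .append(member.getKey())
               .append(";\n");
        }
        out.append("}\n");
        return out.toString();
    }
}
```

Passing an insertion-ordered map such as LinkedHashMap keeps the emitted members in model order.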

Runtime libraries#

The libraries and code that are used to power a client, server, serialization, and deserialization are called runtime libraries. The code generator needs to have prior knowledge of these libraries and how to call them. The code generated by a code generator is expected to automatically work based on the Smithy model the code was generated from. For example, if the model contains a service shape marked with the aws.auth#sigv4 trait for auth, then the generated code should be configured to use AWS Signature version 4 and have a dependency on any necessary libraries for the target environment.

Deciding on the libraries you use, which dependencies you take, and what public interfaces you expose is part of the design phase of both the generator and runtime libraries. The runtime libraries can be designed separately from the code generator, but there does need to be some consideration given to how a code generator will configure and compose runtime components at codegen-time.
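The aws.auth#sigv4 example above suggests that a generator inspects the traits on a service and derives the runtime dependencies the generated code needs. A minimal sketch of that decision follows, assuming hypothetical dependency names (foo-client-runtime and foo-aws-sigv4) for the hypothetical Foo language:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: derives runtime dependencies from the trait IDs applied to a service.
final class RuntimeDependencies {
    static Set<String> forServiceTraits(Set<String> traitIds) {
        Set<String> dependencies = new LinkedHashSet<>();
        // Assumed core runtime library that every generated client depends on.
        dependencies.add("foo-client-runtime");
        // Only pull in the signing library when the model asks for SigV4 auth.
        if (traitIds.contains("aws.auth#sigv4")) {
            dependencies.add("foo-aws-sigv4");
        }
        return dependencies;
    }
}
```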

You don't need Smithy models at runtime#

Smithy code generators should use model-ignorant code generation, a method of generating code that does not require the models the code was generated from to be available at runtime. This makes the Smithy model itself an implementation detail of the generated code, and it removes the need to write a Smithy implementation in the target environment. Code generated from Smithy models does not need the Smithy model at runtime because things like routing, serialization, deserialization, and orchestration can all be generated at codegen-time. If any elements of the Smithy model need to be made available at runtime, they can be made available using other language-specific mechanisms like Java annotations, Rust attributes, interfaces, etc.
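As an illustration of what model-ignorant output might look like, the hypothetical generated type below has its member name and JSON key baked in at codegen-time, so nothing loads or consults a Smithy model when it runs. The GetCityInput name and the simplified string escaping are assumptions for this sketch.

```java
// Hypothetical generated type: everything it needs from the model was
// written into the source at codegen-time.
final class GetCityInput {
    private final String cityId;

    GetCityInput(String cityId) {
        this.cityId = cityId;
    }

    String cityId() {
        return cityId;
    }

    // Straight-line serializer: the "cityId" key is a literal, not a model lookup.
    String toJson() {
        return "{\"cityId\":\"" + escape(cityId) + "\"}";
    }

    // Simplified escaping for the sketch: handles only backslashes and double quotes.
    private static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```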

Client, server, and type code generation#

Smithy code generators should be able to generate clients, servers, and types. Each of these use cases should be served by a different smithy-build.json plugin, though they should all rely on a shared implementation. For example, here's how service code generation could be configured for a Java code generator:

smithy-build.json#

{
   "version": "1.0",
   "projections": {
       "source": {
           "plugins": {
               "java-server-codegen": {
                   "service": "com.bigco.example#Example",
                   "package": "com.bigco.example",
                   "packageVersion": "0.0.1",
                   "edition": "2022"
               }
           }
       }
   }
}
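One way to picture separate plugins relying on a shared implementation is a set of thin plugin entry points that all delegate to a common generator. In the sketch below, java-server-codegen comes from the configuration above. The java-client-codegen and java-types-codegen names, and all classes, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

enum CodegenMode { CLIENT, SERVER, TYPES }

// Hypothetical shared implementation: every mode generates types,
// and each mode adds its own artifacts on top.
final class SharedCodegen {
    static List<String> artifactsFor(CodegenMode mode) {
        List<String> artifacts = new ArrayList<>();
        artifacts.add("types");
        if (mode == CodegenMode.CLIENT) {
            artifacts.add("client");
        }
        if (mode == CodegenMode.SERVER) {
            artifacts.add("server-stubs");
        }
        return artifacts;
    }
}

// Thin plugin entry points, one per smithy-build.json plugin name.
final class JavaClientCodegenPlugin {
    static final String NAME = "java-client-codegen"; // hypothetical name
    static List<String> execute() { return SharedCodegen.artifactsFor(CodegenMode.CLIENT); }
}

final class JavaServerCodegenPlugin {
    static final String NAME = "java-server-codegen";
    static List<String> execute() { return SharedCodegen.artifactsFor(CodegenMode.SERVER); }
}

final class JavaTypesCodegenPlugin {
    static final String NAME = "java-types-codegen"; // hypothetical name
    static List<String> execute() { return SharedCodegen.artifactsFor(CodegenMode.TYPES); }
}
```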

Client generation#

All Smithy implementations should generate clients.

- This is where most code generators should start.
- Clients generated from a model should not use the exact same types and interfaces as a service generated from a model, for two reasons. First, many Smithy services use projections to generate clients, and the projections often have features removed that are internal-only or available only to a subset of customers. Second, servers are authoritative: they have perfect knowledge of the model and can generate stricter types. Clients are non-authoritative and need to guard against model updates that are considered backward compatible (for example, adding a new enum member).
- AWS SDKs are built on top of Smithy clients, but Smithy clients are not AWS SDKs. Smithy clients do not require the use of AWS protocols, signing algorithms, regions and endpoint resolution, ~/.aws/config, etc. Note that Smithy does not support a first-party protocol today, so in practice most clients will likely rely on an AWS protocol like aws.protocols#restJson1.
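The enum case mentioned above shows why client and server types can differ. In this sketch, with hypothetical ClientSuit and ServerSuit types, the client enum tolerates a member that was added to the model after the client was generated, while the server enum rejects anything outside its authoritative model.

```java
// Hypothetical client-side enum: non-authoritative, so unrecognized wire values
// are mapped to UNKNOWN instead of failing.
enum ClientSuit {
    CLUB, DIAMOND, UNKNOWN;

    static ClientSuit fromWire(String value) {
        for (ClientSuit suit : values()) {
            if (suit != UNKNOWN && suit.name().equals(value)) {
                return suit;
            }
        }
        return UNKNOWN;
    }
}

// Hypothetical server-side enum: authoritative, so an unrecognized value is an error.
enum ServerSuit {
    CLUB, DIAMOND;

    static ServerSuit fromWire(String value) {
        // Enum.valueOf throws IllegalArgumentException when no constant matches.
        return valueOf(value);
    }
}
```

A production client would likely also preserve the raw unknown value so that it can be logged or passed along. That detail is omitted here.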

Server generation#

Some Smithy code generators will generate service framework code. This can include service interfaces, stubs to implement each operation, request deserializers, response serializers, etc.

- If you know that your language will also provide a service framework, it's best to start the service development while the clients are being developed. This helps to ensure that as much code as possible can be shared across the generators.
- Even if you don't plan on writing a service right now, it helps to think about how a service code generator could be added in a way that reuses much of the client code generation.
- When adding features to generated types and interfaces, consider whether the feature is applicable to both client and server code. If it isn't, then the feature should either be removed, refactored, or added in such a way that it is only optionally generated for clients.
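To give a feel for generated service framework code, the sketch below pairs a hypothetical generated interface (one stub method per operation) with a generated router that dispatches by operation name. WeatherOperations, WeatherRouter, and the GetCity operation are assumptions, and plain strings stand in for the request and response types that real deserializers and serializers would produce.

```java
// Hypothetical generated interface: service authors implement one method per operation.
interface WeatherOperations {
    String getCity(String cityId);
}

// Hypothetical generated router: operation names are literals emitted at codegen-time.
final class WeatherRouter {
    private final WeatherOperations handler;

    WeatherRouter(WeatherOperations handler) {
        this.handler = handler;
    }

    String dispatch(String operationName, String input) {
        switch (operationName) {
            case "GetCity":
                return handler.getCity(input);
            default:
                throw new IllegalArgumentException("Unknown operation: " + operationName);
        }
    }
}
```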

Type generation#

Smithy code generators can generate standalone types. For example, this would happen when a service has no operations or resources but only shapes bound to the service via the (upcoming) shapes property.

- Generation of types should still require a service shape that is used to create a closure of shapes.
- The service shape dictates the serialization formats supported by the generated types using protocol traits.
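A closure of shapes can be thought of as everything reachable from the service shape. The sketch below computes that set with a simple graph walk over a hypothetical map from shape ID to referenced shape IDs. ShapeClosure is an illustrative helper and not part of the Smithy reference implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: collects every shape reachable from a service shape.
final class ShapeClosure {
    static Set<String> of(String serviceId, Map<String, List<String>> references) {
        Set<String> closure = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(serviceId);
        while (!pending.isEmpty()) {
            String shape = pending.pop();
            // add() returns false for shapes already visited, which also guards against cycles.
            if (closure.add(shape)) {
                for (String target : references.getOrDefault(shape, List.of())) {
                    pending.push(target);
                }
            }
        }
        return closure;
    }
}
```

Shapes that exist in the model but are not reachable from the service are left out of the result, so no types would be generated for them.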