Prompt Template

Prompt templates guide the model’s response generation. This use case demonstrates how to set up FlexFlow Serve with Langchain and use prompt templates to structure dynamic prompts.


Requirements:

  • FlexFlow Serve setup with appropriate configurations.

  • Langchain integration with templates for prompt management.


Implementation steps:

  1. FlexFlow Initialization: Initialize and configure FlexFlow Serve.

  2. LLM Setup: Compile the model and start the server for text generation.

  3. Prompt Template Setup: Set up a prompt template to guide the model’s responses.

  4. Response Generation: Use the LLM with the prompt template to generate a response.

  5. Shutdown: Stop the FlexFlow server after generating the response.
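The template substitution in step 3 is plain named-variable string formatting; a minimal, dependency-free sketch of what Langchain’s PromptTemplate.format does (the helper name format_prompt is illustrative, not part of either library):

```python
template = "Question: {question}\nAnswer:"

def format_prompt(template: str, **variables: str) -> str:
    # Fill the template's named placeholders, mirroring what
    # PromptTemplate.format does (minus Langchain's input validation).
    return template.format(**variables)

print(format_prompt(template, question="Who was the US president in 1997?"))
# Question: Who was the US president in 1997?
# Answer:
```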


Complete code examples can be found here:

  1. Prompt Template Example with incremental decoding

  2. Prompt Template Example with speculative inference

Example Implementation:

import flexflow.serve as ff
from langchain.prompts import PromptTemplate

# FlexFlowLLM is the Langchain-compatible wrapper defined in the complete
# examples linked above; it holds a compiled, running FlexFlow LLM.
ff_llm = FlexFlowLLM(...)

# Template with a single input variable, `question`.
template = "Question: {question}\nAnswer:"
prompt = PromptTemplate(template=template, input_variables=["question"])

# Fill the template and pass the formatted prompt to the LLM.
response = ff_llm.generate(prompt.format(question="Who was the US president in 1997?"))
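The FlexFlowLLM wrapper above is defined in the complete examples linked earlier; its rough shape is sketched below. This is a hypothetical stub, not the real implementation: the `output_text` attribute on the generation result is an assumption, and the real wrapper would hold a compiled, started FlexFlow LLM.

```python
class FlexFlowLLM:
    """Hypothetical sketch of a Langchain-friendly wrapper around a
    FlexFlow LLM; the real class lives in the linked examples."""

    def __init__(self, llm):
        # `llm` is assumed to be a compiled ff.LLM with its server started.
        self.llm = llm

    def generate(self, prompt_text: str) -> str:
        # Delegate to the FlexFlow LLM and return the generated text.
        # `output_text` is an assumed field on the generation result.
        results = self.llm.generate(prompt_text)
        return results[0].output_text if results else ""
```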