Flash provides two ways to execute remote functions on serverless endpoints: queue-based (QB) and load-balanced (LB) endpoints. This guide covers using the @remote decorator with load-balanced endpoints for HTTP-based function execution.
Queue-Based Endpoints (ServerlessEndpoint, LiveServerless)
- Requests queued and processed sequentially
- Automatic retry logic on failure
- Built-in fault tolerance
- Higher latency (queuing + processing)
- Fixed request/response format
Load-Balanced Endpoints (LoadBalancerSlsResource, LiveLoadBalancer)
- Requests routed directly to available workers
- Direct HTTP execution, no queue
- No automatic retries
- Lower latency (direct HTTP)
- Custom HTTP routes and methods
Use Load-Balanced when you need:
- Low latency API endpoints
- Custom HTTP routing (GET, POST, PUT, DELETE)
- Direct HTTP response handling
- Handling multiple routes on single endpoint
Use Queue-Based when you need:
- Automatic retry logic on failures
- Sequential, fault-tolerant processing
- Tolerance for higher latency
- Simple request/response pattern
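Because load-balanced endpoints do not retry automatically, a caller that needs resilience has to add it on its own side. The following is a minimal sketch of a client-side retry helper with exponential backoff; `call_with_retries`, `attempts`, and `base_delay` are hypothetical names chosen for illustration and are not part of Flash.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def call_with_retries(
    func: Callable[..., Awaitable[Any]],
    *args: Any,
    attempts: int = 3,
    base_delay: float = 0.1,
    **kwargs: Any,
) -> Any:
    """Await func(*args, **kwargs), retrying failures with exponential backoff.

    Hypothetical helper for illustration; not part of Flash.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await func(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries
            await asyncio.sleep(base_delay * (2 ** attempt))
```

A call would then look like `await call_with_retries(greet_user, "Alice")`. Retrying is only safe for operations that can be repeated without side effects, so a wrapper like this suits reads better than writes.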
For local development, use LiveLoadBalancer:
from runpod_flash import LiveLoadBalancer, remote
# Create load-balanced endpoint
api = LiveLoadBalancer(name="example-api")
# Define HTTP-routed function
@remote(api, method="POST", path="/api/greet")
async def greet_user(name: str):
    return {"message": f"Hello, {name}!"}
# Call the function locally
async def main():
    result = await greet_user("Alice")
    print(result)  # {"message": "Hello, Alice!"}
if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
Key points:
- method parameter specifies the HTTP method (GET, POST, PUT, DELETE, PATCH)
- path parameter specifies the URL route (must start with /)
- Functions execute directly without deployment during development
Load-balanced endpoints require explicit HTTP routing metadata in the @remote decorator.
method (required for LoadBalancerSlsResource)
- Must be one of: GET, POST, PUT, DELETE, PATCH
- Case-insensitive (POST, post, Post all work)
path (required for LoadBalancerSlsResource)
- Must start with / (e.g., /api/process, /health)
- Can include path parameters (e.g., /api/users/{user_id})
- Cannot use reserved paths: /execute, /ping
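The rules for method and path can be expressed as a small validation function. This is only a sketch of the checks described above, not Flash's own validator; `validate_route`, `VALID_METHODS`, and `RESERVED_PATHS` are hypothetical names, and the error wording is illustrative.

```python
VALID_METHODS = {"GET", "POST", "PUT", "DELETE", "PATCH"}
RESERVED_PATHS = {"/execute", "/ping"}


def validate_route(method: str, path: str) -> tuple:
    """Return the normalized (method, path) pair or raise ValueError.

    Hypothetical helper for illustration; not part of Flash.
    """
    normalized = method.upper()  # methods are case-insensitive
    if normalized not in VALID_METHODS:
        raise ValueError(
            f"Invalid HTTP method {method!r}; must be one of: "
            + ", ".join(sorted(VALID_METHODS))
        )
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    if path in RESERVED_PATHS:
        raise ValueError(f"path {path!r} is reserved")
    return normalized, path
```

For example, `validate_route("post", "/api/process")` returns `("POST", "/api/process")`, while `validate_route("GET", "/ping")` raises `ValueError`.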
Multiple functions can share a single LoadBalancerSlsResource with different routes:
from runpod_flash import LiveLoadBalancer, remote
api = LiveLoadBalancer(name="user-service")
@remote(api, method="GET", path="/users")
def list_users():
    return {"users": []}
@remote(api, method="POST", path="/users")
async def create_user(name: str, email: str):
    return {"id": 1, "name": name, "email": email}
@remote(api, method="GET", path="/users/{user_id}")
def get_user(user_id: int):
    return {"id": user_id, "name": "Alice"}
@remote(api, method="DELETE", path="/users/{user_id}")
async def delete_user(user_id: int):
    return {"deleted": True}
When deployed:
- Single user-service endpoint created
- Four HTTP routes registered automatically
- FastAPI handles routing to correct function
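To make the routing step concrete, here is a rough sketch of how a path template such as /users/{user_id} can be matched against an incoming request path. FastAPI does this internally; the code below is a simplified stand-in using only the standard library, and `compile_template` and `match_path` are hypothetical names.

```python
import re
from typing import Optional


def compile_template(template: str) -> "re.Pattern":
    """Turn '/users/{user_id}' into a regex with one named group per parameter."""
    # Splitting on a capture group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal segment
        else:
            pattern += f"(?P<{part}>[^/]+)"     # one path segment per parameter
    return re.compile(f"^{pattern}$")


def match_path(template: str, path: str) -> Optional[dict]:
    """Return the extracted parameters, or None if the path does not match."""
    match = compile_template(template).match(path)
    return match.groupdict() if match else None
```

Here `match_path("/users/{user_id}", "/users/42")` yields `{"user_id": "42"}`. The extracted value is a string; in FastAPI the conversion to `int` comes from the function's type annotation.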
The following paths are reserved by Flash and cannot be used as user-defined routes:
- /ping: Health check endpoint (required, returns 200 OK)
- /execute: Framework endpoint for @remote stub execution (only available with LiveLoadBalancer for local development)
Important Security Note:
- Deployed LoadBalancerSlsResource endpoints deliberately do NOT expose /execute for security reasons
- The /execute endpoint accepts and executes arbitrary Python code, so exposing it would allow remote code execution
- When using deployed endpoints, @remote calls are safely translated to HTTP requests to your user-defined routes
- Never manually add /execute to deployed endpoints
Attempting to use /ping or /execute as user-defined routes will raise a validation error at build time.
For local development and testing, use LiveLoadBalancer instead of LoadBalancerSlsResource:
from runpod_flash import LiveLoadBalancer, remote
api = LiveLoadBalancer(name="my-api")
@remote(api, method="POST", path="/api/process")
async def process_data(x: int, y: int):
    return {"result": x + y}
# In tests or scripts, call directly
async def test():
    result = await process_data(5, 3)
    assert result == {"result": 8}
Key differences:
- LiveLoadBalancer locks the image to the Flash LB runtime (runpod/flash-lb)
- Functions execute directly without deployment
- Ideal for development and CI/CD testing
- Same @remote decorator interface as production
import pytest
from runpod_flash import LiveLoadBalancer, remote
api = LiveLoadBalancer(name="test-api")
@remote(api, method="POST", path="/api/calculate")
async def calculate(operation: str, a: int, b: int):
    if operation == "add":
        return a + b
    elif operation == "multiply":
        return a * b
    else:
        raise ValueError(f"Unknown operation: {operation}")
@pytest.mark.asyncio
async def test_calculate_add():
    result = await calculate("add", 5, 3)
    assert result == 8
@pytest.mark.asyncio
async def test_calculate_multiply():
    result = await calculate("multiply", 5, 3)
    assert result == 15
@pytest.mark.asyncio
async def test_calculate_invalid():
    with pytest.raises(ValueError):
        await calculate("unknown", 5, 3)
The behavior of @remote decorated functions differs between local development and deployed endpoints:
When using LiveLoadBalancer for local testing:
- Functions decorated with @remote serialize their code and POST it to the /execute endpoint
- The /execute endpoint accepts and executes the serialized function code
- Useful for development and CI/CD testing before deployment
When deployed to production:
- Functions decorated with @remote are called via HTTP requests to their user-defined routes
- The stub automatically translates @remote calls into HTTP requests with mapped parameters
- Example: await process_data(5, 3) becomes POST /api/process {"x": 5, "y": 3}
When migrating code from local testing to production:
- No code changes needed: @remote decorated functions work the same way
- The stub automatically detects whether it's LiveLoadBalancer (local) or LoadBalancerSlsResource (deployed)
- User-defined routes must be compatible with JSON serialization for parameters
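The translation from a Python call to a JSON request body can be approximated with the standard library. This sketch is not the Flash stub; it only shows one plausible way positional arguments could be mapped to parameter names, and `build_json_body` is a hypothetical helper. How the real stub treats path parameters is not covered here.

```python
import inspect
import json
from typing import Any, Callable


def build_json_body(func: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Bind call arguments to parameter names and encode them as a JSON object.

    Hypothetical helper for illustration; not part of Flash.
    """
    # Signature.bind raises TypeError for missing or unexpected arguments,
    # mirroring what a normal Python call would do.
    bound = inspect.signature(func).bind(*args, **kwargs)
    return json.dumps(dict(bound.arguments))


async def process_data(x: int, y: int):
    return {"result": x + y}


print(build_json_body(process_data, 5, 3))  # {"x": 5, "y": 3}
```

Binding by name is what lets a positional call such as `process_data(5, 3)` line up with the keyword-style JSON body shown in the example above.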
Parameter Type Constraints on Deployed Endpoints:
When using deployed LoadBalancerSlsResource endpoints, function parameters are serialized to JSON in the HTTP request body. This means:
Supported types:
- Primitive types: int, str, bool, float
- Collections: list, dict, tuple, set
- Nested structures: list[dict[str, int]], etc.
- Optional types: Optional[str], Optional[int]
- Special: None
Unsupported types:
- Custom classes and dataclasses
- Request objects (FastAPI Request, Starlette Request)
- File/binary objects
- Complex Python objects that can't serialize to JSON
- Datetime objects (without custom serialization)
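A quick way to catch unsupported arguments before calling a deployed endpoint is to try encoding them locally. The sketch below uses Python's standard `json` module as a stand-in for whatever encoder Flash uses, which is an assumption; `find_unserializable` is a hypothetical helper.

```python
import json
from datetime import datetime


def find_unserializable(params: dict) -> list:
    """Return the names of parameters that json.dumps rejects.

    Hypothetical helper for illustration; not part of Flash.
    """
    bad = []
    for name, value in params.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad.append(name)
    return bad


params = {
    "x": 5,
    "items": [1, 2, 3],
    "when": datetime(2024, 1, 1),  # rejected without custom serialization
    "tags": {"a", "b"},            # the standard encoder rejects sets
}
print(find_unserializable(params))  # ['when', 'tags']
```

Note that the standard-library encoder rejects `set` values and that a `tuple` comes back as a `list` after a JSON round trip. How Flash treats those two types is not described here, so it is worth testing them explicitly against a deployed endpoint.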
Example of parameter mapping:
# Local call:
result = await process_data(5, "hello", [1, 2, 3])
# Gets translated to deployed endpoint call:
POST /api/process
{
  "x": 5,
  "name": "hello",
  "items": [1, 2, 3]
}
If you need to use complex types (e.g., file uploads, custom objects), use direct HTTP calls instead of the @remote decorator for deployed endpoints. For local development with LiveLoadBalancer, complex types work because the entire function is serialized and executed.
When you run flash build, the system:
- Scans your code for @remote decorated functions
- Extracts HTTP routing metadata (method, path)
- Creates manifest with route registry
- Validates routes for conflicts and reserved paths
- Packages everything for deployment
The runtime then:
- Loads the manifest and route registry
- Creates a FastAPI application with the registered routes
- Starts the server on port 8000
Your @remote decorated functions are discovered and registered automatically based on their HTTP method and path parameters.
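The discovery and validation steps can be pictured as a registry keyed by (method, path). The following is a simplified sketch of that idea, not the actual build code or manifest format; `RouteRegistry` and the manifest field names are hypothetical.

```python
from typing import Callable, Dict, Tuple

RouteKey = Tuple[str, str]


class RouteRegistry:
    """Hypothetical route registry; not Flash's manifest implementation."""

    def __init__(self) -> None:
        self.routes: Dict[RouteKey, str] = {}

    def route(self, method: str, path: str) -> Callable:
        """Decorator that records a function under its (method, path) key."""
        key = (method.upper(), path)

        def decorator(func: Callable) -> Callable:
            if key in self.routes:
                raise ValueError(
                    f"Route conflict detected: {key[0]} {key[1]} defined twice"
                )
            self.routes[key] = func.__name__
            return func  # the function itself is left unchanged

        return decorator

    def manifest(self) -> list:
        """Return the registered routes as a sorted list of plain dicts."""
        return [
            {"method": m, "path": p, "function": name}
            for (m, p), name in sorted(self.routes.items())
        ]
```

Registering two functions under the same method and path raises at decoration time, which corresponds to the route conflict that the build step reports.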
# 1. Define functions with @remote decorator in your code
# 2. Test locally with LiveLoadBalancer
# 3. Build for production
flash build
# 4. Configure your endpoint (optional)
# Edit flash.toml if needed to set image, GPU, etc.
# 5. Deploy
flash deploy
# 6. Check deployment status
flash status
Once deployed, verify your endpoint:
# Check endpoint is healthy
curl https://<endpoint-url>/ping
# Expected response: {"status": "healthy"}
# Call your function via HTTP
curl -X POST https://<endpoint-url>/api/process \
  -H "Content-Type: application/json" \
  -d '{"x": 5, "y": 3}'
Here's a full example with multiple routes, error handling, and testing:
"""
user_service.py - Example load-balanced API service
"""
from runpod_flash import LoadBalancerSlsResource, remote
from typing import Optional
# For production, use LoadBalancerSlsResource
# For local development, use LiveLoadBalancer
api = LoadBalancerSlsResource(
    name="user-service",
    imageName="runpod/runpod-flash-lb:latest"
)
class UserNotFound(Exception):
    pass
# In-memory database for example
users_db = {
    1: {"id": 1, "name": "Alice", "email": "alice@example.com"},
    2: {"id": 2, "name": "Bob", "email": "bob@example.com"},
}
@remote(api, method="GET", path="/health")
def health_check():
    """Health check endpoint."""
    return {"status": "healthy"}
@remote(api, method="GET", path="/users")
def list_users():
    """List all users."""
    return {"users": list(users_db.values())}
@remote(api, method="POST", path="/users")
async def create_user(name: str, email: str):
    """Create a new user."""
    user_id = max(users_db.keys() or [0]) + 1
    user = {"id": user_id, "name": name, "email": email}
    users_db[user_id] = user
    return user
@remote(api, method="GET", path="/users/{user_id}")
def get_user(user_id: int):
    """Get a specific user."""
    if user_id not in users_db:
        raise UserNotFound(f"User {user_id} not found")
    return users_db[user_id]
@remote(api, method="PUT", path="/users/{user_id}")
async def update_user(user_id: int, name: Optional[str] = None,
                      email: Optional[str] = None):
    """Update a user."""
    if user_id not in users_db:
        raise UserNotFound(f"User {user_id} not found")
    user = users_db[user_id]
    if name is not None:
        user["name"] = name
    if email is not None:
        user["email"] = email
    return user
@remote(api, method="DELETE", path="/users/{user_id}")
async def delete_user(user_id: int):
    """Delete a user."""
    if user_id not in users_db:
        raise UserNotFound(f"User {user_id} not found")
    del users_db[user_id]
    return {"deleted": True}
"""
test_user_service.py
"""
import pytest
from runpod_flash import LiveLoadBalancer, remote
from typing import Optional
# Use LiveLoadBalancer for testing
api = LiveLoadBalancer(name="user-service-test")
# Define functions (same as above but use test endpoint)
# ... (function definitions) ...
@pytest.mark.asyncio
async def test_list_users():
    users = list_users()
    assert "users" in users
    assert isinstance(users["users"], list)
@pytest.mark.asyncio
async def test_create_and_get_user():
    # Create a user
    new_user = await create_user("Charlie", "charlie@example.com")
    assert new_user["name"] == "Charlie"
    assert new_user["id"] > 0
    # Get the user
    user = get_user(new_user["id"])
    assert user["name"] == "Charlie"
@pytest.mark.asyncio
async def test_update_user():
    new_user = await create_user("Diana", "diana@example.com")
    updated = await update_user(new_user["id"], name="Diana Updated")
    assert updated["name"] == "Diana Updated"
@pytest.mark.asyncio
async def test_delete_user():
    new_user = await create_user("Eve", "eve@example.com")
    result = await delete_user(new_user["id"])
    assert result["deleted"] is True
    # Should raise error when trying to get deleted user
    with pytest.raises(Exception):  # UserNotFound
        get_user(new_user["id"])
"requires both 'method' and 'path'"
- Problem: Using @remote(lb_resource) without method/path
- Solution: Add both parameters: @remote(lb, method="POST", path="/api/endpoint")
"Invalid HTTP method 'PTACH' must be one of: GET, POST, PUT, DELETE, PATCH"
- Problem: Typo in HTTP method (e.g., PTACH instead of PATCH)
- Solution: Verify method spelling matches valid HTTP verbs
"path must start with '/'"
- Problem: Path doesn't start with forward slash
- Solution: Use absolute paths: /api/endpoint not api/endpoint
"Route conflict detected: POST /api/process defined twice"
- Problem: Two functions with same method and path on same endpoint
- Solution: Change path or method to make each route unique
"Endpoint URL not available - endpoint may not be deployed"
- Problem: Using LoadBalancerSlsResource before calling await resource.deploy()
- Solution: Deploy the endpoint first (await resource.deploy()), which auto-populates endpoint_url, or use LiveLoadBalancer for local testing
- Note: endpoint_url is auto-generated by Runpod after deployment and cannot be manually specified
"HTTP error from endpoint: 500"
- Problem: Function raised an error during execution
- Solution: Check function code for exceptions, view endpoint logs
"Execution timeout on user-service after 30s"
- Problem: Function took longer than 30 seconds to complete
- Solution: Optimize function, consider increasing timeout in LoadBalancerSlsStub
"404 Not Found" or "404 error" when calling @remote on deployed endpoint
- Problem: Function decorated with @remote but missing method and/or path parameters
- Solution: Always provide complete routing metadata: @remote(api, method="POST", path="/api/endpoint")
- Note: On LoadBalancerSlsResource, the stub will try to use the non-existent /execute endpoint if routing metadata is missing
"JSON serialization error" or "unexpected keyword argument" on deployed endpoint
- Problem: Deployed endpoint receiving malformed parameters from @remote call
- Solution: This should not happen automatically (the stub handles parameter mapping). Check:
  - Function parameters are JSON-serializable (not custom classes or Request objects)
  - Function signature matches expected parameter names
  - For complex types, make direct HTTP calls instead of using @remote
"Cannot import module 'user_service'"
- Problem: Function module not found at runtime
- Solution: Ensure module is in Python path, check import statements
"Function 'process_data' not found in executed code"
- Problem: Function source extraction failed
- Solution: Ensure function is defined at module level (not inside another function)
@remote(
    resource_config: LoadBalancerSlsResource | LiveLoadBalancer,
    method: str = None,  # Required: GET, POST, PUT, DELETE, PATCH
    path: str = None,  # Required: /api/route
    dependencies: List[str] = None,  # Python packages to install
    system_dependencies: List[str] = None,  # System packages to install
    accelerate_downloads: bool = True  # Use download acceleration
)
def your_function(...):
    pass
See docs/Load_Balancer_Endpoints.md for detailed architecture and configuration options.
A test/development variant of LoadBalancerSlsResource:
- Locks to Flash LB image
- Enables direct function calls without deployment
- Same decorator interface as production
- Use LiveLoadBalancer for testing - No deployment needed for development
- Test locally before deploying - Catch routing/logic errors early
- Use descriptive paths - /api/users/{user_id} is clearer than /api/u
- Group related routes - Keep similar endpoints on same service
- Handle errors gracefully - Return meaningful error messages to clients
- Verify health checks - Ensure /ping endpoint works after deployment
- Document your API - Add docstrings explaining what each route does
- Review docs/Load_Balancer_Endpoints.md for LoadBalancerSlsResource class architecture
- Review docs/LoadBalancer_Runtime_Architecture.md for runtime execution and request flows
- Check examples in flash-examples/ repository for more patterns
- Use flash build --help to see build options
- Use flash run --help to see local testing options