string_field() function

Create a string column specification.

USAGE

string_field(
    min_length=None,
    max_length=None,
    pattern=None,
    preset=None,
    allowed=None,
    nullable=False,
    null_probability=0.0,
    unique=False,
    generator=None,
)

Parameters

min_length : int | None = None

Minimum string length. Default is None (no minimum).

max_length : int | None = None

Maximum string length. Default is None (no maximum).

pattern : str | None = None

Regular expression pattern for generated strings.
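
When checking generated values against a pattern, it can matter whether the whole string or only part of it has to match. The sketch below uses only Python's standard `re` module and the pattern from the Examples section of this page. The helper names `is_exact_match` and `contains_match` are hypothetical. That `string_field()` treats the pattern as describing the entire string is an assumption, though it is consistent with the example output shown on this page.

```python
import re

# Pattern taken from the Examples section of this page.
PATTERN = r"[A-Z]{3}-[0-9]{4}"


def is_exact_match(value: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return re.fullmatch(PATTERN, value) is not None


def contains_match(value: str) -> bool:
    """Return True if the pattern occurs anywhere inside the string."""
    return re.search(PATTERN, value) is not None


# A code from the example output matches exactly.
exact = is_exact_match("CAS-6685")

# Extra characters fail the exact check but still pass the substring check.
padded_exact = is_exact_match("xxCAS-6685")
padded_contains = contains_match("xxCAS-6685")
```

For validating generated codes, the exact check is usually the stricter and safer choice.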

preset : str | None = None

Preset for realistic data (e.g., "email", "name", "phone_number").

allowed : list[str] | None = None

List of allowed values (categorical constraint).

nullable : bool = False

Whether the column can contain null values. Default is False.

null_probability : float = 0.0

Probability of generating null when nullable=True. Default is 0.0.
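
Read literally, the defaults mean that setting nullable=True on its own leaves the null probability at 0.0. As a rough mental model of what a null probability means, the sketch below replaces each value with None independently with probability p. It uses only the standard library; `apply_nulls` is a hypothetical helper, and independent per-value sampling is an assumption for illustration, not a description of the library's implementation.

```python
import random


def apply_nulls(values, null_probability, seed=None):
    """Replace each value with None independently with the given probability.

    Illustrative only: independent per-value sampling is an assumption.
    """
    if not 0.0 <= null_probability <= 1.0:
        raise ValueError("null_probability must be between 0.0 and 1.0")
    rng = random.Random(seed)
    # rng.random() returns a float in [0.0, 1.0), so a probability of 0.0
    # never produces None and a probability of 1.0 always does.
    return [None if rng.random() < null_probability else v for v in values]


statuses = ["active", "pending", "inactive"]
no_nulls = apply_nulls(statuses, 0.0, seed=23)
all_nulls = apply_nulls(statuses, 1.0, seed=23)
```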

unique : bool = False

Whether all values must be unique. Default is False.

generator : Callable[[], Any] | None = None

Custom callable that generates values. Overrides other settings.
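
The annotation Callable[[], Any] describes a function that takes no arguments and returns one value per call. A minimal sketch of such a callable follows, using only the standard library. The name `make_ticket_id` and the "TCK-" format are hypothetical, chosen purely for illustration.

```python
import random
import string

# Module-level generator with a fixed seed so the sketch is repeatable.
_rng = random.Random(23)


def make_ticket_id() -> str:
    """Return a hypothetical ticket ID: 'TCK-' plus six uppercase letters or digits."""
    alphabet = string.ascii_uppercase + string.digits
    suffix = "".join(_rng.choices(alphabet, k=6))
    return f"TCK-{suffix}"


# Each call takes no arguments and yields one new value.
sample_ids = [make_ticket_id() for _ in range(3)]
```

A callable like this would be passed as string_field(generator=make_ticket_id). Per the description above, it then overrides the other settings.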

Returns

StringField

A string field specification.

Examples


Define a schema with string fields and generate test data:

import pointblank as pb

# Define a schema with string field specifications
schema = pb.Schema(
    name=pb.string_field(preset="name"),
    email=pb.string_field(preset="email", unique=True),
    status=pb.string_field(allowed=["active", "pending", "inactive"]),
    code=pb.string_field(pattern=r"[A-Z]{3}-[0-9]{4}"),
)

# Generate 100 rows of test data
pb.preview(pb.generate_dataset(schema, n=100, seed=23))

Polars, Rows: 100, Columns: 4

     name              email                         status    code
     String            String                        String    String
1    Vivienne Rios     vivienne.rios@gmail.com       pending   CAS-6685
2    William Schaefer  williamschaefer@aol.com       active    XGI-0397
3    Lily Hansen       lilyhansen@hotmail.com        active    DCW-6086
4    Shirley Mays      shirley.mays27@aol.com        inactive  YBG-9529
5    Sean Dawson       sean.dawson29@aol.com         pending   XLS-9459
96   Kathryn Green     kathryn.green@hotmail.com     active    THG-2900
97   Daniel Morris     dmorris@yahoo.com             inactive  CHC-3681
98   William Cooper    williamcooper@protonmail.com  inactive  HKT-3552
99   Lane Sawyer       l_sawyer@zoho.com             active    OEW-4157
100  Paisley Sandoval  paisley_sandoval@gmail.com    pending   FSX-8948

In the generated data, names and emails are coherent (each email is derived from the name in the same row), statuses are sampled from the allowed values, and codes match the regex pattern.
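
These properties can be spot-checked by hand. The sketch below transcribes three rows from the preview above and checks them against the constraints declared in the schema: unique emails, statuses from the allowed list, and codes matching the pattern. It uses only the standard library and does not call pointblank; `check_rows` is a hypothetical helper.

```python
import re

# Constraints copied from the schema in the example above.
ALLOWED_STATUSES = {"active", "pending", "inactive"}
CODE_PATTERN = re.compile(r"[A-Z]{3}-[0-9]{4}")

# First three rows, transcribed from the preview output.
rows = [
    ("Vivienne Rios", "vivienne.rios@gmail.com", "pending", "CAS-6685"),
    ("William Schaefer", "williamschaefer@aol.com", "active", "XGI-0397"),
    ("Lily Hansen", "lilyhansen@hotmail.com", "active", "DCW-6086"),
]


def check_rows(rows):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    emails = [email for _, email, _, _ in rows]
    if len(set(emails)) != len(emails):
        problems.append("duplicate email found")
    for name, _email, status, code in rows:
        if status not in ALLOWED_STATUSES:
            problems.append(f"{name}: unexpected status {status!r}")
        if CODE_PATTERN.fullmatch(code) is None:
            problems.append(f"{name}: code {code!r} does not match the pattern")
    return problems


problems = check_rows(rows)
```

Checking a sample this way only confirms the rows inspected. It does not replace validating the full generated dataset.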