Filling Sources: Attachments & Text Input
Attach any document, type what you know, write what you need — the AI reads everything together and fills your form
Overview
Every form filling session starts with the question: where is the data? Sometimes it's in a PDF. Sometimes it's in a photo of an insurance card. Sometimes you have an Excel export, a scanned contract, and a name you know off the top of your head that isn't in any file. Sometimes you also need to tell the AI something — "skip section 3," "use the business address not the personal one," "write N/A for anything you can't find."
Instafill.ai handles all of this through two input channels that work simultaneously in the same session: file attachments and a text input field. Attach any combination of supported documents. Type anything you know or any instruction you want followed. The AI reads both together — it doesn't separate "here's data" from "here's instructions." One field, one natural language interface, unlimited flexibility.
Both channels feed into the same source pool. When autofill runs, everything — file-extracted text and typed text — is passed to the AI field-filling prompts simultaneously. Profile sources selected for the session also merge into the same pool. The AI determines what is a data value to extract and what is an instruction to follow, without you needing to separate them.
Supported File Types for Attachment
Eight file formats are accepted as source attachments. There is no need to convert files before uploading — each format is handled natively.
Documents
PDF (.pdf)
The primary document format for form filling. The system accepts all PDF varieties: fillable AcroForms, flat/scanned PDFs, and image-based PDFs where the content is embedded as images rather than text. Text is extracted per-page using OCR and AI vision, watermarks are removed before extraction, and a page-level mapping is computed for large files (over 20,000 tokens) so the AI only receives source content relevant to the form page being filled at any moment.
Word documents (.docx, .doc)
Modern and legacy Word formats are both supported. Text is extracted from the file's structure, producing clean content that the AI receives without conversion artifacts. See Word Document Filling for full detail on Word file support.
Plain text (.txt)
Loaded directly. Useful for exporting from systems that produce text output, for pasting structured data into a file, or for any situation where the relevant information is already in plain text.
Spreadsheets and Tabular Data
Excel (.xls, .xlsx)
Loaded via the platform's SourceLoader, which handles both legacy .xls and modern .xlsx formats. Useful when the data for a single session sits in one spreadsheet row: attach the Excel file and specify in the text input which row or record to use.
CSV (.csv)
Loaded as delimited text. Common for data exports from CRMs, HR systems, insurance platforms, and government databases. The AI reads column headers alongside values to understand field context.
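To illustrate why column headers matter, here is a minimal sketch of how a CSV export could be rendered as labeled text so that each value carries its header. This is an illustration only, not the platform's loader; the function name csv_to_labeled_text and the sample columns are hypothetical.

```python
import csv
import io


def csv_to_labeled_text(csv_text: str) -> str:
    """Render each CSV row as 'header: value' lines, one block per record.

    Hypothetical helper for illustration; not the platform's actual loader.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for index, row in enumerate(reader, start=1):
        # Skip empty cells so blank values are not presented as data.
        pairs = [f"{header}: {value}" for header, value in row.items() if value]
        blocks.append(f"Record {index}\n" + "\n".join(pairs))
    return "\n\n".join(blocks)


# Made-up sample export with one empty cell in the second row.
sample = "name,policy_number,state\nMaria Vasquez,PN-1001,MA\nJohn Doe,,NY\n"
labeled = csv_to_labeled_text(sample)
print(labeled)
```

Pairing each value with its header means a reader (human or model) never has to guess which column a bare value came from.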
Images and Scans
PNG, JPG, JPEG (.png, .jpg, .jpeg)
Photos of documents, screenshots, scanned images, and photos taken with a phone camera are all processed via vision AI. This covers insurance cards photographed at point of service, signed documents scanned to image, ID documents photographed for verification, and any situation where the user has a photo rather than a digital document. Image quality affects extraction accuracy — 300 DPI or higher produces the best results, but lower-quality images are processed with best-effort extraction.
Text Input: Data and Instructions in the Same Field
The text input is not a search box and not a structured data form. It is a natural language field where you write whatever is relevant to filling the form — data, instructions, or both in the same message — and the AI parses it.
As a Data Channel
Type information you know that isn't in any attached document:
- "The client's date of birth is March 12, 1985. They have two dependents. They are not a US citizen."
- "NPI number is 1234567890. The practice address is 400 Main St, Suite 12, Boston MA 02101."
- "Employee start date: February 3, 2026. Department: Revenue Cycle. Reports to: Sarah Chen."
The AI extracts these values and maps them to the appropriate form fields exactly as it would from a PDF source. Typed data fills gaps that uploaded documents don't cover — the client detail you know from memory, the number from a system you can't export, the piece of information that exists only in your head.
As an Instruction Channel
Type directives for how the AI should handle the fill:
- "Use formal language throughout. Spell out all abbreviations."
- "Leave Section 4 blank — it will be completed by the client."
- "If you cannot find a value, use N/A rather than leaving the field empty."
- "Use the business address, not the personal address, wherever address is requested."
- "Fill dates in DD/MM/YYYY format."
- "The applicant is applying as an individual, not an organization."
These instructions are part of the same source text the AI reads when filling each field. There is no separate "instructions" field and no special syntax required — write them in plain English alongside data, or on their own.
As Both at Once
The most common use is mixed: some sentences are data, some are instructions. The AI handles both:
"Client name: Maria Vasquez. DOB: 07/14/1972. This form is for the 2025 policy year. For any field about prior claims, the answer is No. Use the mailing address from the attached PDF, not the property address."
The AI extracts "Maria Vasquez" as a name value, "07/14/1972" as a date, "2025" as the policy year, "No" as the prior claims value, and follows the address instruction — all from a single block of text that required no structuring or formatting from the user.
How Sources Are Processed Together
Session Source Pool
At session creation, all inputs merge into a single combined source pool:
- File attachments: Each uploaded file is stored with its extracted text, encrypted with workspace-scoped keys
- Text input: The typed content is stored encrypted, alongside the file entries
- Profile sources: If profiles are selected for the session, their files and text entries also join the same pool
The text input is encrypted using workspace-scoped keys before storage, the same as file content.
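The pool described above can be pictured as a flat list of entries that all expose text, regardless of origin. The following is a rough sketch under that assumption; the class and field names (SourceEntry, SessionSources, kind, label) are hypothetical, and encryption at rest is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceEntry:
    kind: str   # "file", "text", or "profile" (hypothetical labels)
    label: str  # e.g. a file name or "typed input"
    text: str   # extracted or typed text; the real system stores this encrypted


@dataclass
class SessionSources:
    entries: List[SourceEntry] = field(default_factory=list)

    def add(self, kind: str, label: str, text: str) -> None:
        """Append one source to the shared pool."""
        self.entries.append(SourceEntry(kind, label, text))

    def combined_text(self) -> str:
        """Join every entry into one block, tagging each with its origin."""
        return "\n\n".join(
            f"[{entry.kind}: {entry.label}]\n{entry.text}" for entry in self.entries
        )


pool = SessionSources()
pool.add("file", "prior_filing.pdf", "Legal name: Acme Corp")
pool.add("text", "typed input", "Use 'Acme Holdings LLC' as the legal name.")
pool.add("profile", "company profile", "Registered agent: J. Smith")
print(pool.combined_text())
```

Because every entry reduces to text in the same list, file content, typed input, and profile content can be handled uniformly downstream.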
Retrieval at Fill Time
When autofill runs for a field group, all source text — both typed and uploaded — is passed to the AI filling prompt together. The AI receives the full content of every source simultaneously. It doesn't apply file sources first and then check the text input; all source content informs every field fill.
Page-Level Mapping for Large Files
For file sources that exceed 20,000 tokens, the system computes a page-level mapping. When filling fields on page 5 of a 20-page form, only source text identified as relevant to page 5 reaches the AI prompt — rather than the full document. This keeps context focused and extraction accurate across long source documents without truncating content.
Typed text input is always included in full with every field group fill, since instructions and supplementary data typed by the user apply across the entire form.
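The selection logic described in this section can be sketched roughly as follows. Only the 20,000-token threshold comes from the text above; the four-characters-per-token estimate, the shape of page_map (form page number to a list of zero-based source page indices), and all function names are assumptions made for illustration.

```python
from typing import Dict, List

TOKEN_THRESHOLD = 20_000  # threshold stated in the documentation


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token.
    return len(text) // 4


def select_source_text(
    source_pages: List[str], page_map: Dict[int, List[int]], form_page: int
) -> str:
    """Return the whole source if it is small, else only the mapped pages."""
    full_text = "\n".join(source_pages)
    if estimate_tokens(full_text) <= TOKEN_THRESHOLD:
        return full_text
    # Large source: keep only the pages mapped to this form page.
    relevant = page_map.get(form_page, [])
    return "\n".join(source_pages[i] for i in relevant)


def build_prompt_sources(
    source_pages: List[str],
    page_map: Dict[int, List[int]],
    form_page: int,
    typed_text: str,
) -> str:
    """Typed text is appended in full, whatever was selected from the file."""
    file_part = select_source_text(source_pages, page_map, form_page)
    return f"{file_part}\n\n{typed_text}"


# Three large pages (about 37,500 estimated tokens in total) trigger the mapping.
big_source = ["A" * 50_000, "B" * 50_000, "C" * 50_000]
prompt = build_prompt_sources(big_source, {5: [1]}, 5, "Use N/A for missing values.")
print(len(prompt))
```

In this sketch a form page with no mapping receives no file text at all; a real system would need a fallback for that case.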
Use Cases
Intake and onboarding with mixed document types: A healthcare intake coordinator attaches a patient's prior visit records (PDF) and their insurance card photo (JPEG), and types: "Patient prefers not to be contacted by phone. Primary language: Spanish. Referring physician is Dr. R. Patel, NPI 9876543210." The AI fills the intake form using all three sources simultaneously: clinical data from the PDF, insurance details from the photo, and preferences and referral information from the typed text.
Legal forms with instructions for sensitive fields: A paralegal fills a court-filing form by attaching the client matter file (PDF) and typing: "Do not include the client's home address — use the firm's address for all address fields. Leave the case number blank, it will be assigned at filing. Describe the matter as 'breach of contract' throughout." The PDF provides party names, dates, and facts; the text input provides the address instruction and the matter description override.
Insurance applications with Excel data: A commercial lines broker attaches an Excel export from the agency management system (one row per location) and types: "Use row 3 — that's the primary location. The business has been in operation since 2008. No prior claims in the last 5 years." The AI reads the spreadsheet row for location address, revenue, and employee count, and fills prior claims and operating history from the typed text.
Credentialing with photo ID: A credentialing specialist doesn't have the physician's DEA certificate as a PDF, only a photo taken at an in-person meeting. They attach the JPEG alongside the physician's CV (PDF) and their malpractice certificate (PDF), and type: "DEA registration expires 04/30/2027. Use Dr. Torres's hospital affiliation, not the private practice address, for section 6." The AI extracts the DEA number from the photo, biographical and training data from the CV, and malpractice limits from the certificate, while following the address instruction.
Government forms with knowledge-only data: A compliance officer fills a state regulatory form without any document for a specific field — they simply know the value. They attach the company's prior filing (PDF) for most fields and type: "Current FEIN: 83-4521000. The company changed its name on January 1, 2026 — use 'Acme Holdings LLC' as the legal name, not 'Acme Corp' as shown in the prior filing. The registered agent is still the same." The AI uses the current-year name from the text and the prior filing for everything else.
Benefits
- No format lock-in: Whatever format the information arrived in — scan, spreadsheet, photo, Word document, email text you paste in — the AI reads it. Users don't convert or restructure before uploading.
- Typed knowledge counts as a source: Information that exists only in the user's head — a number they know, a detail from a phone call, a prior arrangement not in any document — can be typed directly and used as data for field filling.
- Instructions and data in one place: There is no separate "instructions" panel, no configuration screen for overrides, no special syntax. Write what you want in natural language in the same field as the data, and the AI follows both.
- Large files don't lose resolution: Page-level mapping for sources over 20,000 tokens means long documents don't get truncated — the AI receives the right section of a large file for each part of the form.
- Sources compound: Three PDFs + an Excel file + typed supplementary data + a profile all contribute to one fill. Each source covers what the others don't, and the AI resolves conflicts by flagging them for review rather than silently choosing.
- Everything is encrypted: Typed text is encrypted before storage using workspace-scoped keys. File content is encrypted in Azure Blob Storage. Source data never persists unencrypted.
Security & Privacy
All source inputs — both file attachments and text input — are encrypted and workspace-scoped:
- Text input encryption: Typed content is encrypted with workspace-scoped keys before being stored in the session document.
- File storage encryption: Uploaded files are stored in Azure Blob Storage with workspace-scoped encryption. The session stores the Azure URL reference, not the file content directly.
- Workspace isolation: All source entries in session.sources[] are scoped to the workspace. JWT middleware prevents cross-workspace access to any session's source content.
- Access control: Sources are accessible only to users with permissions on the originating session. Neither file content nor typed text is visible outside the session's workspace.
- No AI training: Attached files and typed text input are used only for the specific form filling session. They are never used to train or fine-tune AI models.
- Configurable retention: Source documents and session text follow the workspace retention policy. Stateless Mode (available in Autofill from Sources) deletes all source content immediately after the session completes — useful for highly sensitive documents where zero post-session retention is required.
Common Questions
What is the maximum file size for attachments?
The maximum file size is set at the workspace level. The default matches standard PDF size limits. For very large files — multi-hundred-page documents, large Excel files — contact support to discuss workspace-level limits.
Note that files over 20,000 tokens trigger page-level mapping, which adds a processing step but does not prevent the file from being used. Very large files may have slightly longer source processing time before autofill becomes available.
Can I attach more than one file to a single session?
Yes — there is no enforced limit on the number of files per session. A typical complex session might include three or four source documents covering different aspects of the form: one for identity information, one for financial data, one for prior history, and a typed addition for instructions or supplementary details.
Each file is processed independently and added to the session source pool. The AI draws from all files simultaneously when filling each field group.
What exactly should I type in the text input field?
Anything relevant to filling the form, in plain English. Three types of content work well:
Data you know that isn't in any file: Names, numbers, dates, statuses you know from memory or from a conversation that wasn't documented. Write it as naturally as you'd tell a colleague: "Her date of birth is June 3, 1990. She's a US citizen. No prior bankruptcies."
Instructions for how to fill specific fields: Overrides, preferences, or special handling for edge cases: "Use the P.O. Box for mailing address, not the street address. Leave the supervisor field blank. If the form asks for a fax number, use the main office fax: 617-555-0100."
Corrections to what's in the files: If a file contains outdated or wrong information for a specific field: "The phone number in the attached PDF is old. The current number is 617-555-0199."
You don't need to separate these into categories or use any special format — write what you need and the AI handles the rest.
How does the AI tell the difference between data and instructions in the text input?
It uses the same contextual understanding that powers the rest of the AI filling pipeline — semantic analysis of the text relative to the form fields. A sentence like "The client's name is Maria Vasquez" is clearly a data value. A sentence like "Leave Section 4 blank" is clearly an instruction. Mixed or ambiguous phrasing is handled by the AI applying the content wherever it is most relevant.
You don't need to help the AI distinguish between the two. Write naturally. If you want to be explicit for clarity, you can structure your text with headers like "Additional data:" and "Instructions:" — but this is a personal preference, not a requirement.
Can I update or replace sources after a session has started?
Yes — sources can be updated via POST /api/session/{sessionId}/apply-sources. Upload new files or submit new text input, and the session source pool updates. After updating sources, re-run autofill to fill fields using the updated source set.
This is useful when you realize after the first fill that a source document was missing, or when you want to try filling with a different set of source documents to compare results.
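As a rough illustration, a text-only update to that endpoint might be assembled as below. Only the path POST /api/session/{sessionId}/apply-sources comes from this page. The base URL, the bearer-token header, and the JSON body with a "text" key are assumptions, so check the API reference for the real request shape. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_apply_sources_request(
    base_url: str, session_id: str, token: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the apply-sources endpoint.

    The payload shape and auth header are assumptions for illustration.
    """
    url = f"{base_url}/api/session/{session_id}/apply-sources"
    body = json.dumps({"text": text}).encode("utf-8")  # hypothetical body
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_apply_sources_request(
    "https://example.invalid",  # placeholder host, not a real endpoint
    "abc123",
    "TOKEN",
    "The phone number in the attached PDF is old. The current number is 617-555-0199.",
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to urllib.request.urlopen(request), followed by re-running autofill so the fields pick up the updated source set.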
How do profile sources interact with session attachments and text input?
When you select a profile for a session, the profile's stored files and text entries are added to the session source pool at creation, alongside your session-specific attachments and text input.
Profile sources are useful for information that doesn't change across sessions: company contact details, standard organizational language, frequently used reference documents. Session attachments handle what's specific to the current fill: the individual's documents, the current-year form, the case-specific records. Both contribute to every field fill without any configuration beyond selecting the profile.