Behind the Scenes at GSA’s Hackathon
The General Services Administration is setting its sights on artificial intelligence (AI) to make the digital experience easier and more efficient for citizens. That’s why it challenged innovators and coders to figure out how to optimize federal websites for large language models (LLMs) in its Federal AI Hackathon.
LLMs can read and interpret websites and provide reliable answers about federal services. Gone are the days when a simple search engine was enough; now, people expect more intuitive, conversational responses powered by AI.
GSA recognized that many federal websites and their data aren’t fully prepared to support LLMs. So, in the hackathon, teams were assigned one of five federal sites—Data.gov, Nsf.gov/award search, Ask.usda.gov, Data.census.gov, or Eia.gov—and given AI tools like generative AI on AWS, Azure OpenAI, Google Cloud, ChatGPT, GroqCloud, Command R, and Slack.
The goal was to enhance these federal websites using code, development standards, and advanced AI features to help LLMs provide clearer, more useful information to citizens.
Why Optimize for LLMs Now?
GSA’s push to elevate the website user experience with LLMs aligns with how people search for information today. More and more, people are turning to chatbots, virtual assistants like Siri and Alexa, and tools like ChatGPT. While not all of these are fully powered by LLMs yet, that’s changing fast.
Federal websites can’t afford to fall behind. Without optimizing for LLMs, they risk not only frustrating users but also providing incorrect or misleading information.
Right now, LLMs do their best with the data they have, but when they lack a clear answer, they might guess or even make one up. These AI responses can sound convincing, but users may not realize they need to double-check the information. Accuracy also varies significantly across chatbot and LLM providers: one may generate a correct response where others do not.
That’s why initiatives like GSA’s hackathon are so important. By optimizing its websites for LLMs now, GSA is working to stop the spread of unreliable information and make sure people can quickly and accurately get the government info they need.
Reimagining Federal Sites for LLM Optimization
At the hackathon, REI’s team was tasked with redesigning Eia.gov, which tracks U.S. energy data like gasoline prices and energy consumption across regions. The challenge is that the site is packed with tabular data, and when you feed that into LLMs as just HTML, the context can get muddled.
If someone asked about the highest or lowest gas prices in a region, an LLM reading the site might struggle to answer those types of questions accurately. That’s where our solution came in. Instead of overhauling the content or forcing a confusing structure on users, we created a separate API endpoint that feeds the data in a structured way, making the LLM’s answers much more accurate.
These API endpoints act as a direct link between LLMs and the website, allowing the LLM to either scan the site using HTML or pull clean, structured data through the API.
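To make the idea concrete, here is a minimal sketch of what such a structured endpoint might return. This is not EIA’s actual API; the function name, query parameters, regions, and prices are all illustrative assumptions. The point is that a query like “highest gas price” resolves against clean, typed data rather than muddled tabular HTML.

```python
import json

# Hypothetical sample of regional gasoline prices (USD per gallon).
# Real figures would come from the agency's published datasets.
GAS_PRICES = {
    "New England": 3.42,
    "Gulf Coast": 2.98,
    "West Coast": 4.15,
}

def gas_price_endpoint(query_params):
    """Sketch of a structured JSON response an LLM could consume
    instead of scraping tabular HTML."""
    region = query_params.get("region")
    if region in GAS_PRICES:
        payload = {"region": region,
                   "price_usd_per_gallon": GAS_PRICES[region]}
    elif query_params.get("stat") == "highest":
        # Resolve an aggregate question directly from structured data.
        top = max(GAS_PRICES, key=GAS_PRICES.get)
        payload = {"region": top,
                   "price_usd_per_gallon": GAS_PRICES[top],
                   "stat": "highest"}
    else:
        # Unanswerable queries fail explicitly instead of guessing.
        payload = {"error": "unknown query"}
    return json.dumps(payload)

print(gas_price_endpoint({"stat": "highest"}))
```

Because the endpoint answers aggregate questions itself, the LLM never has to infer a maximum from scattered table cells, which is exactly where HTML scraping tends to go wrong.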
To build this API, we used a knowledge graph approach, which works better than the traditional retrieval-augmented generation (RAG) method. RAG scans all the text from a website and stores it in a database, so when someone asks a question, the LLM searches for the most relevant context. However, this approach often struggles with accuracy, because retrieving from unstructured data can lead to errors.
A knowledge graph, on the other hand, maps out relationships between entities extracted from unstructured data, allowing the LLM to capture long-range connections and provide contextually accurate answers.
When users ask questions whose answers aren’t explicitly in the data, knowledge graph-based LLMs built on structured data still manage to generate highly accurate answers, or admit they don’t know, rather than providing a seemingly correct but false answer.
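The “answer or admit you don’t know” behavior can be sketched in a few lines. This toy graph is an assumption for illustration only, not our hackathon implementation: entities are nodes, and each named relation is an edge to a grounded value. A lookup either follows an edge that actually exists or returns an explicit “I don’t know,” mirroring how a graph-backed LLM avoids hallucinating.

```python
# Toy knowledge graph as an adjacency map: each entity points to its
# related values, keyed by relation name. All data is illustrative.
GRAPH = {
    "Gulf Coast": {"type": "region", "avg_gas_price": "2.98 USD/gal"},
    "West Coast": {"type": "region", "avg_gas_price": "4.15 USD/gal"},
}

def answer(entity, relation):
    """Return a grounded answer, or admit the graph has no such edge,
    rather than fabricating a plausible-sounding value."""
    edges = GRAPH.get(entity, {})
    return edges.get(relation, "I don't know")

print(answer("West Coast", "avg_gas_price"))  # 4.15 USD/gal
print(answer("West Coast", "phone_number"))   # I don't know
```

In a production system the graph would be far richer and the LLM would translate natural-language questions into graph traversals, but the guarantee is the same: every answer is backed by an explicit edge, and missing edges surface as honest uncertainty.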
Changing the Future of Federal Web Applications
At the hackathon, teams explored various ways to optimize federal websites for LLMs. Our team chose knowledge graphs because, while they’re more challenging to build and implement, they drastically improve accuracy, which is the key to a better user experience both in general and in the public sector. They’re also highly effective for chat-based search and information retrieval.
These improvements go beyond just making federal websites better. For instance, if you asked ChatGPT for the phone number to inquire about social security benefits, the LLM wouldn’t automatically know. It would need to scan the relevant site, which often lists multiple phone numbers across different pages. The result? The LLM might give you the wrong one.
As more people rely on AI, there’s also the danger of bad actors inserting false information. That’s why accuracy and context matter. AI needs to be trustworthy, and structuring data properly is key to making LLMs reliable.
A Path Forward for the Public Sector
Citizens have already begun using AI-powered search engines to find information. Agencies should consider ‘AI as a User’ of their websites.
To do this, agencies should consider adding API endpoints for LLM integration on their public-facing sites, and prioritize the use cases where this interaction is most impactful. After designing, testing, and launching the added API features, agencies should collect user feedback and continue to iterate based on user insights and performance monitoring.
REI is here to help. Our hackathon approach was praised as “innovative” and “impressive” by the EIA judges, but it’s just one way to bolster site performance and user experience.
Contact REI Systems at info@reisystems.com today to help you enhance your website for the future of AI.