export const tcoBlog = `
<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="content-type">
  </head>
  <body class="c19 doc-content">
    <p class="c4">
      <span class="c0">As companies build extensive data systems to support modern applications, data management costs have become a significant concern. Efforts to build AI, whether experimental or in production, add further strain: they need immediate access to text, image, video, and other modalities of data, which makes data storage and extraction costlier.</span>
    </p>
    <p class="c4">
      <span class="c0">However, beyond storage costs, labor costs for highly skilled AI engineers are also a large and growing line item. Building complex data systems that integrate various applications takes significant time and effort, and the result is often a tangled web of data transfers. This complexity makes setup, maintenance, and error resolution challenging and time-consuming.</span>
    </p>
    <p class="c4">
      <span>In this post, we explore the total cost of ownership of a database for AI applications, including both hard and soft costs. We&rsquo;ll also explore how purpose-built databases that consolidate storage costs </span>
      <span class="c10">and</span>
      <span class="c0">&nbsp;operational overhead under a single roof can save companies money in both categories, particularly labor. In extreme cases, a purpose-built database can reduce costs by as much as 10x.</span>
    </p>
    <h1 class="c5" id="h.532xu85bnbao">
      <span class="c15">The Hard Costs</span>
    </h1>
    <p class="c4">
      <span class="c0">Hard costs form the foundation of database expense discussions. A typical data backend might use various tools:</span>
    </p>
    <p class="imgContainer">
			<span style="overflow: hidden; display: inline-block; margin: 0px; border: none; width: 100%; height: auto; box-sizing: border-box;">
        <img alt="" src="https://aperturedata-public.s3.us-west-2.amazonaws.com/website_images/tco_blog/tooling.png" style="width: 60%; height: 60%; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title="">
      </span>
    </p>
    <p class="c4">
      <span class="c0">Using some combination of these tools often generates four different types of hard costs:</span>
    </p>
    <ol class="c3 lst-kix_1siut9h6d1p6-0 start" start="1">
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Barrier Costs</span>
        <span class="c0">: Upfront or subscription fees for proprietary products.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Storage Costs</span>
        <span class="c0">: Charges per byte stored, higher for standard databases and lower for optimized data stores.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">CPU Costs</span>
        <span class="c0">: Fees for CPU time used in data processing, sometimes spent just converting data into the formats that different tools expect.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Network Costs</span>
        <span class="c0">: Costs for data transfer between nodes, with rates that differ for ingress and egress.</span>
      </li>
    </ol>
    <p class="c4">
      <span>Combined, these costs really add up. One way to think about it is that each byte ingested into the storage system carries multiple micro-costs. This is </span>
      <span class="c16">
        <a class="c1" href="/blog/challenges-with-multimodal-data-evolving">magnified when building AI pipelines</a>
      </span>
      <span class="c0">&nbsp;because teams often copy the same data to saturate large-scale model training, keep local copies for experimentation to avoid repeatedly querying this web of data systems, or accumulate overlapping datasets as data scientists search for representative data to improve AI outcomes. The result is redundant, unnecessary expense.</span>
    </p>
    <h1 class="c5" id="h.ux332gh9qmqc">
      <span class="c15">The Soft Costs</span>
    </h1>
    <p class="c4">
      <span class="c0">The most significant expense, however, often lies in the engineering labor required to set up and maintain these systems. Engineers need to configure databases, connect nodes, write transformation code, manage security, and handle ETL processes. Maintenance involves updating schemas, applying security patches, reconciling API updates that may not match across tools, and ensuring data consistency across systems. Training new engineers and managing permissions add further complexity and time costs.</span>
    </p>
    <p class="c4">
      <span class="c0">Human errors add another layer of cost, potentially leading to data loss or compliance violations. When data scientists and ML engineers are forced to work with such traditional data systems, dataset preparation, rather than analysis, can consume most of their time. If models don&rsquo;t perform as expected, tracking down an error in the data web, perhaps from an incorrect conversion or mismatched data types, can be nearly impossible or consume many precious team hours. Any disruption in this do-it-yourself spaghetti solution can require filing tickets with multiple vendors and navigating each of their support systems.</span>
    </p>
    <h1 class="c5" id="h.v1j0gdhvxy1t">
      <span class="c15">The Opportunity Cost</span>
    </h1>
    <p class="c4">
      <span>Even as off-the-shelf traditional and generative AI models improve, achieving the desired outcomes for AI use cases in production still requires access to proprietary data. While traditional applications could be implemented on siloed data, AI applications need data from multiple sources and modalities to achieve near-human outcomes. Complex data architectures delay their implementation, sometimes by as long as 6-12 months. When it is hard to query and visualize the underlying data used to train AI applications, or to surface the right responses in Gen AI applications, data science and ML teams struggle to reach the desired level of accuracy and performance for their models. This difficulty in visualizing the data is somewhat intentional as a </span>
      <span class="c16">
        <a class="c1" href="https://www.linkedin.com/pulse/aperturedata-problem-month-accessing-meaty-sensitive-data-xkhnc/">way to</a>
      </span>
      <span class="c16">
        <a class="c1" href="https://www.linkedin.com/pulse/aperturedata-problem-month-accessing-meaty-sensitive-data-xkhnc/">&nbsp;protect PII</a>
      </span>
      <span class="c0">, but the downside is that it makes it harder to tweak AI models, which can often result in go-to-market delays. These delays become a real cost if a competitor is more agile at implementing AI and beats the company to market.</span>
    </p>
    <h1 class="c5" id="h.vdm3ih3x2lre">
      <span class="c15">Enter a Purpose-Built Database</span>
    </h1>
    <p class="c4">
      <span class="c0">We built ApertureDB, a purpose-built database for multimodal data, to address these challenges and reduce costs.</span>
    </p>
    <h2 class="c11" id="h.lkb89dz4hgvi">
      <span class="c7">Hard Cost Savings</span>
    </h2>
    <ul class="c3 lst-kix_fgvgfg1fssfp-0 start">
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Efficient tech spend</span>
        <span class="c0">: Instead of splitting storage between small units (text, often stored in a Postgres-like solution, sometimes several to suit different data types) and larger units (images or video, often stored in an S3-like solution), ApertureDB consolidates data types behind a unified API. Relying on fewer third-party applications lowers subscription costs, since you need only one solution for all of your complex multimodal processing needs.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Resource efficiency:</span>
        <span class="c0">&nbsp;Data can be preprocessed or augmented on the fly, and is often downsized for ML use cases, which means lower network costs, lower unit costs (less duplication), and simplified compliance.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Resource Utilization</span>
        <span class="c0">: Finally, there are savings from bare-metal efficiency. Because a purpose-built database processes data rapidly, on the fly and in parallel, GPUs stay busy and saturated, maximizing their utilization.</span>
      </li>
    </ul>
    <h2 class="c11" id="h.cdhag57cst41">
      <span class="c7">Soft Cost Savings</span>
    </h2>
    <ul class="c3 lst-kix_gyj5yjs05l5e-0 start">
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Simplified Management</span>
        <span>: One database means fewer connections, less setup, and fewer maintenance tasks. There is only </span>
        <span class="c10">one</span>
        <span class="c0">&nbsp;database to learn, significantly fewer connections to set up, and far fewer workaround scripts.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Reduced Errors and Data Loss</span>
        <span>: Lower risk of duplication and errors, minimizing time spent on corrections. If you ask data teams what they </span>
        <span class="c10">waste their time on</span>
        <span class="c0">, a common answer is one-off requests to fix data that was incorrectly duplicated or destroyed. With a single storage location, this is no longer a huge issue.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Ease of Use</span>
        <span>: Streamlined training and permission management save time and reduce complexity. There is also only </span>
        <span class="c10">one</span>
        <span>&nbsp;database to issue updates and patches to, </span>
        <span class="c10">one</span>
        <span>&nbsp;database to train new employees on, </span>
        <span class="c10">one</span>
        <span>&nbsp;database to grant access to, and </span>
        <span class="c10">one</span>
        <span class="c0">&nbsp;database to monitor for any errors. </span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c2">Focused Effort</span>
        <span class="c0">: With the ability to index, query, and easily visualize data using ApertureDB&rsquo;s blended vector search and graph filtering, data science and ML teams can easily prepare and understand their datasets. Data teams can focus on optimizing AI accuracy and performance rather than managing a complex data system.</span>
      </li>
    </ul>
    <h2 class="c11" id="h.hjhwo7gr94h1">
      <span class="c7">Opportunity Cost Elimination</span>
    </h2>
    <p class="c4">
      <span class="c0">A purpose-built database accelerates AI implementation by simplifying data integration and access. For instance, running multimodal AI analysis is dramatically easier with a database like ApertureDB because it consolidates data; otherwise, each modality must be analyzed independently and the results crudely stitched together in a single query script.</span>
    </p>
    <p class="c4">
      <span class="c0">The simplified data access with ApertureDB enables faster deployment of AI applications, giving companies a competitive edge.</span>
    </p>
    <h1 class="c5" id="h.ay5ynwm7q3pf">
      <span class="c15">Real-World Evidence</span>
    </h1>
    <p class="c4">
      <span>So far, we&rsquo;ve made a largely theoretical argument for why a purpose-built database saves on costs. However, we also have empirical evidence to support that claim. We&rsquo;ve observed teams deploy ApertureDB, a multimodal database, into production, paying particular attention to their net savings. Let&rsquo;s visit the migration experiences of two </span>
      <span class="c10">very</span>
      <span class="c0">&nbsp;different companies.</span>
    </p>
    <h2 class="c5" id="h.guyh9jftax1o">
      <span class="c7">A Fortune 100 big box retailer</span>
    </h2>
    <p class="c4">
      <span class="c0">First is a Fortune 500 company with a massive online storefront and a visual AI team of 5-10 people. The team&rsquo;s job was to understand why consumers weren&rsquo;t seeing proper recommendations when browsing products on the online storefront.</span>
    </p>
    <p class="c4">
      <span class="c0">They discovered a few subproblems:</span>
    </p>
    <ol class="c3 lst-kix_uhum9b960d1m-0 start" start="1">
      <li class="c4 c9 li-bullet-0">
        <span class="c0">It was difficult to obtain permissions to read the correct product and asset management tables, as Fortune 500s are naturally protective about data.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Simple tasks would often take hours (and sometimes days) because data was located in various places.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Data science teams would download data once and continue using it without refreshing it, to avoid the delays.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Embeddings were extracted by various teammates but not consolidated, resulting in inconsistent recommendations that were very difficult to debug when they were not as expected.</span>
      </li>
    </ol>
    <p class="c4">
      <span>Beyond the hard costs of local dataset copies and of repeatedly extracting embeddings with no way to collaborate, these problems imposed soft costs on data team productivity through day-to-day headaches, and a huge opportunity cost in lost revenue (the revenue gains from delayed or undeployed models are hard to quantify), all due to a disconnected </span>
      <span>system</span>
      <span class="c0">.</span>
    </p>
    <p class="c4">
      <span class="c0">Moving to ApertureDB did require setting up ETL from various systems, but that only needed to be automated once. The biggest benefits of the move were solutions to the prior problems:</span>
    </p>
    <ol class="c3 lst-kix_ecp0jukko10o-0 start" start="1">
      <li class="c4 c9 li-bullet-0">
        <span>Permissions were simplified to just managing RBAC for </span>
        <span class="c10">one</span>
        <span class="c0">&nbsp;system.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span>Previously tedious tasks of collecting data from multiple tools became simple queries against one system, reducing the time needed to </span>
        <span>under </span>
        <span class="c0">an hour.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Data teams no longer needed to create local copies, since they could get pre-processed, regularly refreshed data on the fly in the expected format.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span>Embeddings (even from different models for comparisons) could be indexed in </span>
        <span class="c10">one</span>
        <span class="c0">&nbsp;location, together with the corresponding images, annotations, and metadata. </span>
      </li>
    </ol>
    <p class="c4">
      <span class="c0">Now, with fewer hiccups, the team was able to turn their attention to improving their models&rsquo; efficacy. The final result? Customers received better recommendations, leading to better revenue potential.</span>
    </p>
    <h2 class="c5" id="h.o992y5xa5r45">
      <span class="c7">Badger Technologies</span>
    </h2>
    <p class="c4">
      <span>Badger Technologies (Badger) is a retail </span>
      <span>robotics </span>
      <span class="c0">company that enables retailers to monitor product placements via a motorized, tower-like robot. Badger robots pass through aisles, ingest data from an array of cameras, and use their onboard computer vision models to convert those images to an array of embeddings. They then need high-throughput vector classification to identify the products and match them to their locations and pricing, generating alerts in case of mismatches. On the model preparation side, the Badger team also needed a way to store images of store products, label them, and train the models to deploy on their robots.</span>
    </p>
    <p class="imgContainer">
			<span style="overflow: hidden; display: inline-block; margin: 0px; border: none; width: 100%; height: auto; box-sizing: border-box;">
        <img alt="" src="https://aperturedata-public.s3.us-west-2.amazonaws.com/website_images/tco_blog/badger_arch.png" style="width: 60%; height: 60%; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title="">
      </span>
    </p>
    <p class="c4">
      <span class="c0">This created a multimodal data problem. Specifically:</span>
    </p>
    <ol class="c3 lst-kix_sj58zuao4ivg-0 start" start="1">
      <li class="c4 c9 li-bullet-0">
        <span class="c0">On the store side, Badger needed to reliably perform thousands of vector classifications per second across a large number of stores.</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">On the model preparation side, Badger needed a unified location to store their image data, speed up labeling tasks, and simplify dataset preparation for model training.</span>
      </li>
    </ol>
    <p class="c4">
      <span>As described in </span>
      <span class="c16">
        <a class="c1" href="/case-studies/1">this case study</a>
      </span>
      <span class="c0">, their previous vector database solutions could not keep up with the scale, which slowed their growth and led to revenue loss. With a complex manual solution for data labeling, where data was scattered across various file storage locations, labeling was often delayed by months, wasting a great deal of people-time.</span>
    </p>
    <p class="c4">
      <span class="c0">After implementing ApertureDB, Badger was able to improve its store performance by more than 2.5x. Today, Badger handles 10,000 queries per second without any instability, made possible by ApertureDB&rsquo;s scalable multimodal vector database. With ApertureDB&rsquo;s ML integrations with labeling and training frameworks, the Badger team was able to unify their image and label storage in one database, which significantly simplified their data collection and model training pipelines.</span>
    </p>
    <h2 class="c4 c18" id="h.dedi5f70yuan">
      <span class="c7">Dataset Preparation Speedup for Fast-Growing Startups</span>
    </h2>
    <p class="c4">
      <span class="c0">We saw similar video labeling, training, and inference examples with a frictionless checkout company, Zippin, and a stealth biotech company. Both were dealing with large collections of videos in storage buckets, either labeled or requiring integration with a labeling pipeline, and both needed to extract interesting frames and keep costs down, all while growing rapidly. Using ApertureDB, they have been able to meet their data requirements in half the time and with half the resources, both crucial benefits for startups.</span>
    </p>
    <h2 class="c4 c18" id="h.b2pb3enhl17p">
      <span class="c7">ApertureDB RAG-based Documentation Chatbot</span>
    </h2>
    <p class="c4">
      <span>The ApertureDB team wanted to </span>
      <span>experiment </span>
      <span>with our vector search (and, soon, our knowledge graph capabilities) to build a query-response chatbot to help our users. With our LangChain RAG implementation, it took us less than a week to crawl our website and documentation pages, index the necessary embeddings, and add a </span>
      <span class="c16">
        <a class="c1" href="http://docs.aperturedata.io">plug-in to our documentation page</a>
      </span>
      <span class="c0">&nbsp;(alpha version right now). There are, of course, more improvements to be made by experimenting with various LLMs, token sizes, and knowledge graph constructions, but data management did not slow us down. We are happy with the improved user experience and can now improve our documentation when user queries point to missing pages (a new blog coming soon).</span>
    </p>
    <p class="imgContainer">
			<span style="overflow: hidden; display: inline-block; margin: 0px; border: none; width: 100%; height: auto; box-sizing: border-box;">
        <img alt="" src="https://aperturedata-public.s3.us-west-2.amazonaws.com/website_images/tco_blog/docs_chat.png" style="width: 60%; height: 60%; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title="">
      </span>
    </p>
    <h1 class="c5" id="h.4e9zb9p24krs">
      <span class="c15">Why Go Purpose-Built?</span>
    </h1>
    <p class="c4">
      <span class="c0">Purpose-built databases offer significant savings in both hard and soft costs, and they cut the opportunity costs associated with delayed AI implementation. They provide:</span>
    </p>
    <ul class="c3 lst-kix_w6x7juoeu5e0-0 start">
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Lower storage costs</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Reduced reliance on multiple products</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Simplified infrastructure management</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Minimized error-related disruptions</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Streamlined training and permission management</span>
      </li>
      <li class="c4 c9 li-bullet-0">
        <span class="c0">Faster AI deployment</span>
      </li>
    </ul>
    <p class="c8">
      <span>To learn more about how a purpose-built database for multimodal analytics can save costs, </span>
      <span class="c16">
        <a class="c1" href="/contact-us">book a demo</a>
      </span>
      <span>. We have built an industry-leading database for multimodal AI to future-proof data pipelines as multimodal AI methods evolve. Stay informed about our journey by subscribing </span>
      <span class="c16">
        <a class="c1" href="https://docs.google.com/forms/d/e/1FAIpQLSdl05L10a-AUuf0qGV0jD3SU3u2JMH_4I6tn_aAxmjaGI2ppw/viewform">to our blog.</a>
      </span>
    </p>
    <p class="c8">
      <span class="c10 c13"><i>I want to acknowledge the insights and valuable edits from Mathew Pregasen and JJ Nguyen.</i></span>
    </p>
  </body>
</html>
`;
