ArangoDB
https://arangodb.com/
The database for graph and beyond

Introducing ArangoDB’s Data Loader: Revolutionizing Your Data Migration Experience
https://arangodb.com/2023/12/introducing-arangodbs-data-loader-revolutionizing-your-data-migration-experience/
Thu, 14 Dec 2023 11:57:57 +0000

Estimated reading time: 7 minutes

At ArangoDB, our commitment to empowering companies, developers, and data enthusiasts
with cutting-edge tools and resources remains unwavering. Today, we’re thrilled to unveil our
latest innovation, the Data Loader, a game-changing feature designed to simplify and streamline
the migration of relational databases to ArangoGraph. Let’s dive into what makes Data Loader a
must-have tool for your data migration needs.

Say Goodbye to Complex Relational Database Structures

If you’ve ever grappled with the intricate relationships of old-school relational databases, where
tables connect to other tables, and a third table acts as the linchpin, you understand the
challenge of managing data in this convoluted setup. It’s like solving a puzzle – finding
connections, managing dependencies, and ensuring data integrity, a time-consuming process
prone to errors.

Data Loader is your go-to solution for effortlessly migrating relational databases to
ArangoGraph. This powerful tool simplifies the process by allowing you to define edge relations
between nodes with a simple drag-and-drop of raw CSV files. Whether you’re a seasoned
developer or just starting your journey with ArangoDB, Data Loader provides an intuitive and
user-friendly interface, making the migration process a breeze.

You could go from the old-school, less intuitive approach like this:

[Image]

To a modern and intuitive approach like this:

[Image]

Say goodbye to the days of dealing with convoluted relational structures. Data Loader welcomes
you to a new era of data management simplicity. It’s not just a tool; it’s a revolution in how you
approach and manage your data relationships.

The Significance of Data Loader

The true significance of Data Loader lies in its ability to transform your existing relational tables
into a graph full of relationships. By leveraging this tool, you can harness the power of
ArangoGraph without the complexities of manual data migration. It’s a bridge that effortlessly
connects your relational data to the world of graph databases.

Our Clear Goal: Making Data Migration Comfortable and Intuitive

Our goal with Data Loader is crystal clear – to make migrating to a graph-type model
comfortable and intuitive. Whether your use case is simple or complex, Data Loader empowers
you to transition from relational databases to ArangoGraph, unlocking the full potential of
graph-based data modeling.

The Future of Data Loader

As we continue to evolve, Data Loader is just the beginning. We’re excited to announce that, in
the future, we’ll be expanding its capabilities to support additional file formats, including TSV
and JSON files. This ensures that Data Loader remains a versatile and comprehensive solution
for all your data migration needs.

Get the Most Out of Your Data with ArangoDB Solutions

Get ready to supercharge your data migration experience with the power of ArangoDB solutions
you already know. Whether you’re in Cyber/Threat Management, Fraud Detection, Supply Chain,
or any other use case, ArangoDB offers tailored solutions to meet your specific needs.

Explore the possibilities with our Solution Accelerators, including Enterprise Knowledge Graphs,
Entity Resolution, Traceability/Lineage, and Contextual Relevance. ArangoDB’s differentiators,
such as Model Flexibility, Graph Scalability, Performance at Scale, and Unified Query Language,
set us apart from other databases.

Start Using Data Loader Today

Try the Data Loader and experience the power of graphs on data you’re
already familiar with. With Data Loader, the transition to a graph-based model is not just
efficient; it’s a game-changer for your data management strategy.

Explore the capabilities of Data Loader and witness firsthand the impact on your insights. Refer
to our comprehensive Data Loader documentation to start your data migration journey.
Welcome to a new era of seamless data migration with ArangoDB.

Now, let’s delve into a hands-on tutorial to guide you through the process. This tutorial leverages
a sample dataset of two files: ‘airports.csv’ and ‘flights.csv.’ These files will allow us to craft a
graph that showcases flights arriving and departing from diverse cities worldwide.

Let’s break down the process into easily manageable steps:

  1. Database and Graph Setup: Begin by naming the database and graph you would like
    to use for the data import.
  2. Upload files: Use the intuitive Data Loader web interface and simply drag & drop your
    CSV files or upload them through the file browser window.
  3. Design graph: Add nodes and edges and map data from the uploaded files to them.
    This will allow you to create your graph’s corresponding documents and collections.
  4. Import data: Finally, import the data and start using your newly created graph and
    collections.

Eager to see it in action?

Let’s start by creating a new database and adding a name for our graph.

[Image]

Now, let’s upload the files. You can drag and drop or upload them via a file browser window.

[Image]

Once the files are in, we can design a graph schema. In this tutorial, we build a simple graph
consisting of two nodes (“origin_airport” and “destination_airport”) and a directed edge going
from the origin airport to the destination airport, representing a flight. Click Add node to create
nodes and connect them with edges.

Next, for each of the nodes and edges, we create a mapping to the corresponding file and
headers.

For nodes, the node label will be a node collection name, and the primary identifier will be used
to populate the _key attribute of documents. We can also select any additional headers to be
included as document attributes. In this example, we’re creating two node collections,
“origin_airport” and “destination_airport.” The “AirportID” header will be used to create the _key
attribute for documents in both node collections. The header preview makes it easy to select the
headers you want to use.

[Image]

For edges, the edge label is going to be an edge collection name. Then, we need to specify how
edges will connect nodes. We do this by selecting the from and to nodes to give a direction to
the edge. In this example, we select the “source airport” header as a source and the
“destination airport” as a target for the edge.

[Image]

Note that the values of source and target for the edge correspond to the primary identifier (_key
attribute) of the nodes. In this example, the airport code (e.g., GKA) is used as the _key in the
node documents and in the source and destination headers to configure the edges.
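
To make the mapping concrete, here is a minimal Python sketch of how CSV rows could become node and edge documents. It is an illustration only, not a description of how Data Loader works internally. The header names “AirportID”, “source airport”, and “destination airport” and the collection names come from this tutorial; the sample rows, the extra “Name”, “City”, and “Airline” columns, and the helper functions build_nodes and build_edges are assumptions made up for the example.

```python
import csv
import io

# Hypothetical sample data; only the header names "AirportID",
# "source airport" and "destination airport" come from the tutorial.
AIRPORTS_CSV = """AirportID,Name,City
GKA,Goroka Airport,Goroka
MAG,Madang Airport,Madang
"""

FLIGHTS_CSV = """source airport,destination airport,Airline
GKA,MAG,CG
MAG,GKA,CG
"""


def build_nodes(csv_text, key_header):
    """Turn each CSV row into a document whose _key comes from key_header."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = {"_key": row[key_header]}
        # Remaining headers become ordinary document attributes.
        doc.update({k: v for k, v in row.items() if k != key_header})
        docs.append(doc)
    return docs


def build_edges(csv_text, from_coll, from_header, to_coll, to_header):
    """Turn each CSV row into an edge document with _from and _to references."""
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        edge = {
            # ArangoDB edges reference nodes as "<collection>/<_key>".
            "_from": f"{from_coll}/{row[from_header]}",
            "_to": f"{to_coll}/{row[to_header]}",
        }
        edge.update(
            {k: v for k, v in row.items() if k not in (from_header, to_header)}
        )
        edges.append(edge)
    return edges


origin_nodes = build_nodes(AIRPORTS_CSV, "AirportID")
destination_nodes = build_nodes(AIRPORTS_CSV, "AirportID")
flight_edges = build_edges(
    FLIGHTS_CSV,
    "origin_airport", "source airport",
    "destination_airport", "destination airport",
)
print(flight_edges[0])
```

The sketch shows why the source and target values must match the node keys: each edge stores only a reference of the form collection/_key, so a flight row with an airport code that is not a node key would point at a document that does not exist.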

After all the mapping is done, all we need to do is click Save and start import. The report
provides an overview of the files processed and documents created and a link to your new
graph.

Finally, click See your new graph to open the ArangoDB web interface and explore your new
collections and your new graph. Happy graphing!

[Image]

How ArangoGraphML Leverages Intel’s PyG Optimizations
https://arangodb.com/2023/12/how-aranggraphml-leverages-intels-pyg-optimizations/
Mon, 04 Dec 2023 04:52:01 +0000

Estimated reading time: 3 minutes

ArangoGraphML + Intel: Next-level Machine Learning Accelerated

ArangoDB and Intel have announced a groundbreaking partnership to enhance Graph Machine Learning (GraphML) using Intel’s high-performance processors. This collaboration, part of the Intel Disruptor Program, will seek to integrate ArangoDB’s graph database solutions with Intel’s Xeon CPUs. This synergy promises to revolutionize data analytics and pattern recognition in complex graph structures, marking a new era in database technology and GraphML advancements.

ArangoGraphML

ArangoGraphML, part of ArangoDB’s suite, is an advanced graph machine learning platform designed for efficient data analysis and pattern recognition in complex graph structures, leveraging graph database technology to drive innovation in data intelligence and analytics.

Machine Learning Performance Challenge

The quest for speed in machine learning platforms is unending. By delving into Intel’s optimizations for PyG (PyTorch Geometric), we aim to harness CPU performance enhancements specifically tailored for Graph Neural Network and PyG workloads. Because ArangoGraphML builds on PyG, any performance improvement there is directly relevant to us and our customers. This exploration is not only about benchmarking Intel’s PyG optimizations but also about internal testing to measure their impact on our platform.

PyG benchmark

Our focus is on gauging the performance of GraphML algorithms within our platform using torch.compile. This method allows us to assess the efficiency gains brought about by Intel’s PyG optimizations during training and inference, providing insights into the tangible benefits for our users.
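
As a rough illustration of the measurement itself, the sketch below times a callable over several epochs and reports the median, which is the statistic used in the results. It is a simplified stand-in, not our benchmark code: median_epoch_time and fake_epoch are hypothetical names, and the workload is a placeholder where a real run would execute one training epoch, once with the eager model and once with the model wrapped by torch.compile.

```python
import statistics
import time


def median_epoch_time(run_epoch, epochs=5):
    """Run run_epoch() `epochs` times and return the median wall-clock seconds."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        run_epoch()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def fake_epoch():
    # Stand-in workload; a real benchmark would run one training epoch here,
    # once with the eager model and once with the torch.compile'd model.
    sum(i * i for i in range(10_000))


eager_time = median_epoch_time(fake_epoch)
print(f"median time per epoch: {eager_time:.6f} s")
```

Using the median rather than the mean keeps a single slow epoch, such as the first compiled epoch that includes compilation overhead, from dominating the result.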

Benchmark methodology

To ensure a robust evaluation, we conducted tests under controlled conditions:

  • System Specifications: We used an AWS EC2 t2.2xlarge instance with 8 vCPUs and 32 GiB RAM.
  • Dataset: We used the ogbn-products dataset, a large-scale undirected and unweighted graph representing an Amazon product co-purchasing network. The task is to predict the category of a product in a multi-class classification setup, where the 47 top-level categories are used as target labels. Its origin in real purchasing data makes it relevant to real-world scenarios.
  • Batch Size, Hidden Channels, and Number of Layers: We experimented with different values of these essential hyperparameters when evaluating the performance of GraphML algorithms.

The outcomes

In our preliminary assessments, we observed a noteworthy increase in performance, achieving a speedup of up to 20%. The gains were evident when comparing the execution times of GraphML algorithms with and without Intel’s PyG optimizations. The results are presented graphically in the chart below and summarized in the accompanying table.

[Chart]

Batch Size  Hidden Channels  Layers  Mode     Median Time per Epoch (s)  Speed up
1024        256              2       Eager    153.803
1024        256              2       Compile  134.106                    1.15x
512         64               2       Eager    89.039
512         64               2       Compile  98.714                     1.11x
512         128              3       Eager
512         128              3       Compile                             1.12x
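
For readers who want to reproduce the speed-up column, the short sketch below shows the arithmetic, assuming the speed-up is the eager median time divided by the compiled median time. The function name speedup is ours; the two timings are the values reported above for the configuration with batch size 1024, 256 hidden channels, and 2 layers.

```python
def speedup(eager_seconds, compiled_seconds):
    """Return how many times faster the compiled run is than the eager run."""
    return eager_seconds / compiled_seconds


# Median seconds per epoch for batch size 1024, 256 hidden channels, 2 layers.
first_row = speedup(153.803, 134.106)
print(f"{first_row:.2f}x")  # prints "1.15x"
```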

Conclusion

With a demonstrated performance boost, we are now leveraging Intel’s PyG optimizations across our platform. This commitment aligns with our dedication to providing users with cutting-edge technology and optimized algorithms for their Graph Neural Network workflows.

As the field of machine learning continues to evolve, ArangoGraphML remains at the forefront, leveraging Intel’s PyTorch Geometric optimizations to ensure our users experience the fastest and most efficient ML platform available.

Stay tuned for further updates on our journey toward excellence in Graph Machine Learning!

ArangoDB’s Exciting Updates: Introducing Our Developer Hub and GenAI Bots!
https://arangodb.com/2023/10/arangodbs-exciting-updates-introducing-our-developer-hub-and-genai-bots/
Wed, 11 Oct 2023 07:01:05 +0000

Estimated reading time: 3 minutes

At ArangoDB, our commitment to empowering developers and data enthusiasts with cutting-edge tools and resources is unwavering. In line with our commitment to “Graph Done Simple,” we are thrilled to unveil two groundbreaking additions to our arsenal that promise to revolutionize your experience with our multi-model graph database.

Developer Hub: Where Knowledge Meets Accessibility

We’ve always believed in the power of community-driven knowledge sharing, and we are proud to present our brand-new Developer Hub, accessible at developer.arangodb.com. This hub is a testament to our dedication to creating an ecosystem that empowers you with the knowledge and resources you need.

Here’s what you can expect:

Ungated ArangoDB Content

Your feedback matters to us; we’ve heard your requests loud and clear. We’re excited to announce that we’ve ungated most of ArangoDB University’s content! Now, you can access a treasure trove of tutorials, courses, and educational materials without barriers. More content is on the way, so keep an eye out for it. Elevate your skills, explore new concepts, and master the art of data management.

[Image: ArangoDB developer hub]

User-Friendly Website and Tutorials

Learning should be a joy, not a struggle. That’s why we’ve updated our tutorials to be more user-friendly than ever before. Dive into practical, hands-on guidance that helps you easily tackle real-world challenges. Whether you’re a seasoned pro or just starting your journey with ArangoDB, our tutorials have something for everyone.

The Go-To Portal for Community and Technical Content

You asked, and we listened. We are consolidating our developer content into a single portal. Going forward, it will be the home for all community and technical content. Your journey with ArangoDB starts here!

Self-Service GenAI Bots: Inkeep and Kapa.ai

In our relentless pursuit of innovation, we’re thrilled to introduce two dynamic additions to our family: Inkeep and Kapa.ai, our self-service GenAI bots. We are running a Bot vs. Bot competition through November 2023. May the best Bot win!

[Image: inkeep]
[Image: kapa.ai]

Inkeep and Kapa.ai: Your ArangoDB Companions

Meet Inkeep and Kapa.ai, your personal ArangoDB companions, available 24/7 and ready to assist you. Whether you need to ask questions about complex queries, fine-tune database performance, or troubleshoot issues, they are your go-to experts. No more digging through manuals or forums; ask your questions in plain language, and the bots will provide instant answers and solutions.

Start Exploring Today

These remarkable additions to the ArangoDB ecosystem are designed to make your development and data management journeys more accessible, efficient, and insightful than ever before. Whether you’re an expert seeking answers to complex development, operations, or optimization questions or just getting started, we invite you to explore our new Developer Hub and harness the potential of our GenAI bots.

Visit developer.arangodb.com now to embark on your journey of knowledge and discovery. Engage with Inkeep (#inkeep-askme) and Kapa.ai (#kapa-ai-askme) in our community Slack workspace at arangodb-community.slack.com.

Thank you for being part of the ArangoDB community. We’re excited to witness the incredible things you’ll achieve with these new tools and resources.

Here’s to a future filled with boundless possibilities!

Warm regards,

The ArangoDB Team

Evolving ArangoDB’s Licensing Model for a Sustainable Future
https://arangodb.com/2023/10/evolving-arangodbs-licensing-model-for-a-sustainable-future/
Wed, 11 Oct 2023 07:01:03 +0000

Estimated reading time: 3 minutes

ArangoDB as a company is firmly grounded in Open Source. The first commit was made in October 2011, and today, we are very proud of having over 13,000 stargazers on GitHub. We believe that the ArangoDB community should be able to enjoy all of the benefits of using ArangoDB, and we have always offered a completely free community edition in addition to our paid enterprise offering.

With the evolving landscape of database technologies and the imperative need to ensure ArangoDB remains sustainable, innovative, and competitive, we’re introducing some changes to our licensing model. These alterations will help us continue our commitment to the community, fuel further development, and assist businesses in obtaining the best from our platform. They are based on changes in the broader database market.

Firstly, for future versions, the source code license will change from the existing Apache 2.0 license to the BSL 1.1. This license will allow full usage of the ArangoDB source code for any purpose except for providing a managed service of ArangoDB. These changes will not impact 99.99% of those currently using the ArangoDB source code but will protect ArangoDB against larger companies providing a competing service using our source code. After four years, the license will revert to Apache 2.0.

Secondly, we are making changes to our community edition with the prepackaged ArangoDB binaries available for free on our website. Where before this was governed by the same Apache 2.0 license as the source code, this will now be governed by a new ArangoDB Community License, which limits its use for commercial purposes and imposes a 100GB limit on dataset size within a single cluster. These changes allow us to enhance Community Edition with many of the features currently only available in Enterprise Edition, such as OneShard graphs providing built-in HA, HotBackups providing improved disaster recovery, and many security improvements while protecting our business interests.

While not all features will be available in the community edition immediately, our intention is for Community Edition to have the same feature set as Enterprise Edition, the only difference being the license governing each edition.

Our Enterprise Edition will continue to be governed by the existing ArangoDB Enterprise License, with commercial terms negotiated on a case-by-case basis with ArangoDB.

What should community users do?

The actual license changes will roll out over the next month or so, and there will be no immediate impact on any individuals; however, once the license changes are fully applied, there will be a few impacts:

  • If you are offering ArangoDB’s open source code or Community Edition as a managed service, you will be unable to use future versions of ArangoDB. Again, we expect the change to the BSL to impact less than 0.01% of our community.
  • If you are using Community Edition for any commercial purpose (including distributing to your end users), then you will not be able to upgrade to versions of ArangoDB governed by the ArangoDB Community License (the upcoming ArangoDB 3.12 release and onwards).
  • If you are using Community Edition and storing more than 100GB of data in aggregate in a single cluster, then you will not be able to upgrade to versions of ArangoDB governed by the ArangoDB Community License (the upcoming ArangoDB 3.12 release and onwards).

If any of these apply to you and you want to avoid future disruption, we encourage you to contact us so that we can work with you to find an appropriate solution.

In summary:

  • Our commitment to open-source ideals remains unshaken. Adjusting our model is essential to ensure ArangoDB’s longevity and to provide you with the cutting-edge features you expect from us.
  • These changes testify to our belief in offering even more power to our community users, bringing Enterprise-level features to the Community Edition.
  • We continue to uphold our vision of an inclusive, collaborative, and innovative community. This change ensures we can keep investing in our products and you, our valued community.

Edit – this blog was updated on 12th October to clarify which future versions of ArangoDB will be impacted by these license changes.

ArangoGraph Now Available on AWS Marketplace
https://arangodb.com/2023/09/arangograph-now-available-on-aws-marketplace/
Mon, 11 Sep 2023 10:24:00 +0000

Estimated reading time: 1 minute

Today we are excited to announce that ArangoGraph, the ArangoDB Managed Service, is available for purchase in the AWS Marketplace. With this announcement, ArangoGraph can now be purchased directly via both AWS and GCP.

The AWS Marketplace provides an extensive catalog of software solutions for users to easily explore, test, buy, and deploy on AWS. If you’re an AWS customer, here’s what this announcement means for you:

  • Reduced procurement time: AWS takes care of the ArangoGraph subscription so you can expedite the procurement process and get building quicker.
  • Consolidated spend and billing: Purchase ArangoGraph through the marketplace to easily add to existing spend and meet committed spend levels faster. AWS will handle billing and payments so you can view all of your AWS spend in one place, with ArangoGraph as a line item on your bill.

You can visit our marketplace listing to get started. The public listing allows you to buy an A32 OneShard deployment in AWS us-east-1, which is our recommendation for most standard workloads. Other deployment types and regions are available for purchase in the AWS marketplace via private offers. Contact us at cloud-sales@arangodb.com to discuss your private offer needs further.

Bridging Knowledge and Language: ArangoDB Empowers Large Language Models for Real-World Applications
https://arangodb.com/2023/08/bridging-knowledge-and-language-arangodb-empowers-large-language-models-for-real-world-applications/
Mon, 28 Aug 2023 19:43:48 +0000

Estimated reading time: 5 minutes

Understanding Large Language Models (LLMs) and Knowledge Graphs

Today, two very different technology concepts have become prominent in data analysis and predictive analytics: Knowledge Graphs and Large Language Models (LLMs). These domains each have their unique benefits, and influence the ways that we engage with and derive meaningful insights from constantly expanding and complex datasets. They are like the Odd Couple: better together than on their own!

Their distinctiveness aside, let’s not forget that these approaches share a commonality—they both change our view of information analysis, interpretation, and application. Both are important to how we make sense of and harness data.

Large Language Models (LLMs): Unveiling Language Potential

Large Language Models, such as OpenAI’s ChatGPT, have become powerful language transformers. These models, with the help of advanced neural networks, possess the uncanny ability to understand, generate, and engage in contextually-aware conversations. LLMs craft coherent responses, generate insightful outputs, and perform many different text-based tasks. They excel at natural language understanding and generation; they navigate and interpret complex textual inputs almost too easily. And how they do this so fast is almost beyond belief.

Knowledge Graphs: Revealing Information Interconnections

On the other hand, Knowledge Graphs contain carefully structured data and are designed to capture intricate relationships among discrete and seemingly unrelated information. These graph-based structures organize data hierarchically while interconnecting data points and relationships. Knowledge Graphs are great at contextual insights, allowing users to explore and comprehend associations and dependencies among data fragments. They are very good at structured queries that show hidden yet insightful connections.

Examples of Knowledge Graphs: Powering Insights

Google has a pretty cool Knowledge Graph, known for enhancing search results with contextual insights. Amazon’s Product Graph refines e-commerce recommendations through structured data, while Facebook’s Graph API improves social interactions. DBpedia extracts structured data from Wikipedia, aiding research. Google’s Knowledge Graph stands out as a leading example, revolutionizing search with semantic understanding and very complete results. These Knowledge Graphs uplevel data understanding and relevance across industries.

Synergistic Potential and Limitations

Both LLMs and Knowledge Graphs have their own strengths and limitations. LLMs are really good at capturing language nuances but frankly suck at interpreting complex, structured data. Conversely, while Knowledge Graphs master structured data organization, it’s not as easy for them to understand the nuances and idiosyncrasies of human language. This is why these two domains are “better together”.

[Image: Knowledge graph for Large Language Models]
Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap; Journal of Latex Class Files, Volume 14, No. 8 August 2021

ArangoDB: Bridging the Gap Between Knowledge Graphs and LLMs

Here, ArangoDB assumes a central role; it’s the bridge that spans LLMs and Knowledge Graphs. As the most complete and scalable graph database, ArangoDB has the added advantage of “model flexibility”. This means it can accommodate LLM capabilities together with the structured insights of more holistic Knowledge Graphs in a unique way. Why? ArangoDB is the only graph database available on the market that can incorporate data of various formats (Graph, Document, Full-Text Search, and Key/Value) within a unified platform, supported by a unified query language.

This flexible yet powerful integration of Knowledge Graphs + LLM addresses the limitations of each domain while harnessing their collective strengths, resulting in a comprehensive and intelligent solution. Imagine having to separately query and then integrate disparate data types from multiple database vendors before you can even pair up the data with an LLM. The LLM is blazingly fast while you have to wait around for the Knowledge Graph to deliver on its half of the bargain!

Real-World Use Cases for Knowledge Graphs + LLM

To really bring out the power of this marriage, let’s consider some real-world examples and use cases of how this all works in a practical sense.

  • Enhancing Healthcare Diagnosis: In the healthcare sector, think about a scenario where a medical Knowledge Graph contains patient records, research data, and treatment information. By integrating LLMs, practitioners can pose complex queries about patient symptoms. LLMs understand these queries and process them to provide contextually-aware insights, aiding in more accurate diagnoses. Without an LLM, the usefulness of the Knowledge Graph, while powerful in its own right, cannot come close to reaching its full potential.
  • Refining E-Commerce Recommendations: E-commerce Knowledge Graphs include data about products, categories, user preferences, reviews, and much more. Adding LLMs into the mix can make customer interactions more meaningful and tailored. Analysts can present queries or preferences and the LLMs comprehend and process the inputs. This makes possible more personalized and sensible recommendations based on the structured data contained in the Knowledge Graph.
  • Uncovering Financial Market Insights: In the financial sector, combining Knowledge Graphs with LLMs enables a better understanding of market trends. By posing questions or analyzing historical data, LLMs interpret queries and rapidly extract insights from the structured financial Knowledge Graph. This lets traders make more informed and timely decisions. In Financial Markets, especially, acting fast can make all the difference.

Getting Started: ArangoDB & LangChain

ArangoDB’s commitment to LLMs begins with its integration with LangChain, the de facto Python framework for building LLM applications through composability.

Our integration with LangChain provides ArangoDB users the ability to analyze data seamlessly via natural language, eliminating the need for query language design. By using LLM chat models such as OpenAI’s ChatGPT, Anthropic’s Claude, or Google’s PaLM, users can speak to their data instead of querying it.
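
To illustrate the idea of speaking to your data, here is a deliberately tiny Python sketch of the flow: a question is translated into a structured lookup, the lookup runs against a graph, and the result is phrased as an answer. Everything in it is hypothetical. The function fake_llm_translate is a hard-coded stand-in for the LLM step, the GRAPH dictionary stands in for a real database, and none of this is the LangChain or ArangoDB API.

```python
# Toy knowledge graph: (subject, relation) -> list of objects.
GRAPH = {
    ("GKA", "flies_to"): ["MAG"],
    ("MAG", "flies_to"): ["GKA", "LAE"],
}


def fake_llm_translate(question):
    """Stand-in for the LLM step: map a question to a structured lookup.

    A real integration would ask a chat model to generate a query; here we
    only recognize one hard-coded question shape.
    """
    prefix = "Where can I fly from "
    if question.startswith(prefix) and question.endswith("?"):
        return (question[len(prefix):-1], "flies_to")
    raise ValueError("question not understood by this toy translator")


def answer(question):
    """Translate the question, look it up in the graph, and phrase the result."""
    subject, relation = fake_llm_translate(question)
    targets = GRAPH.get((subject, relation), [])
    return f"From {subject} you can fly to: {', '.join(targets) or 'nowhere'}"


print(answer("Where can I fly from MAG?"))  # From MAG you can fly to: GKA, LAE
```

In a real setup, the translation step is where the chat model comes in, generating a database query from the user’s question so that the user never writes the query by hand.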

As we commit to building the LLM ecosystem for our ArangoDB users, we invite you to take a look at our first steps with LangChain. 

Conclusion: Empowering Holistic Data Understanding

In today’s uncertain and ever-changing technology landscape – especially as it relates to data analytics – ArangoDB emerges as the natural choice in uniting the potential of LLMs and Knowledge Graphs. This integration is key in leveraging the structural depth of Knowledge Graphs alongside the linguistic power of LLMs. As we move forward, this convergence of capabilities promises new opportunities to build innovative applications that address multiple use cases. Just like Arnold Schwarzenegger in the movie True Lies, who would’ve thought data could lead such a double life?

Don’t miss ArangoDB CTO Jörg Schad’s webinar Unifying Minds: Unleashing the Synergy Between LLMs and Knowledge Graphs on Tuesday, August 29th. Register HERE

As always, don’t hesitate to reach out to us at any time. 

The post Bridging Knowledge and Language: ArangoDB Empowers Large Language Models for Real-World Applications appeared first on ArangoDB.

]]>
https://arangodb.com/2023/08/bridging-knowledge-and-language-arangodb-empowers-large-language-models-for-real-world-applications/feed/ 0
Three Ways to Scale your Graph https://arangodb.com/2023/05/three-ways-to-scale-your-graph/ https://arangodb.com/2023/05/three-ways-to-scale-your-graph/#respond Tue, 23 May 2023 09:52:21 +0000 https://www.arangodb.com/?p=42499 As businesses grow and their data needs increase, they often face the challenge of scaling their database systems to keep up with the increasing demand. What happens when your single server machine is no longer sufficient to store your graph that has grown too large? Or when your instance can no longer cope with the…

The post Three Ways to Scale your Graph appeared first on ArangoDB.

]]>

Estimated reading time: 10 minutes

As businesses grow and their data needs increase, they often face the challenge of scaling their database systems to keep up with the increasing demand.

What happens when your single server machine is no longer sufficient to store your graph that has grown too large? Or when your instance can no longer cope with the increasing amount of user requests coming in?

Sharding is the solution.

While you can scale a single server machine vertically by adding more resources to it, sharding allows you to scale horizontally by distributing your dataset across multiple machines.


What is Sharding?

Sharding is a technique used in database management systems to horizontally scale and partition data across multiple servers or nodes. The idea behind sharding is to divide a large dataset into smaller subsets called shards and store each shard on a different server. By doing this, the system can process and manage the data more efficiently and effectively.

Sharding is particularly useful when dealing with large datasets that cannot be stored on a single server due to hardware limitations or performance constraints. By dividing the data into smaller chunks and distributing them across multiple servers, sharding can help improve query performance, reduce response times, and enable systems to scale to handle large data volumes.

Advantages of Sharding in ArangoDB

The sharding aspect in ArangoDB offers several advantages, including:

  • Improved Performance: By distributing data across multiple servers, ArangoDB can process queries faster, reduce response times and improve overall performance.
  • Scalability: Sharding allows ArangoDB to scale horizontally by adding more servers to the cluster. This enables the system to handle larger datasets without compromising performance.
  • Fault Tolerance: ArangoDB’s sharding architecture provides fault tolerance by ensuring that data is replicated across multiple servers. This means that if one server fails, the system can continue to function without data loss.

Sharding in ArangoDB is based on shard keys. To determine which shard a document should be stored in, ArangoDB uses a hash-based sharding mechanism. When a document is inserted into a collection, ArangoDB hashes the value of the document’s shard key. The shard key is a user-defined field that specifies how the data should be partitioned. By default this is the _key attribute of a document.
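
To make the hash-based mechanism described above more concrete, here is a minimal sketch in Python. It is not ArangoDB's actual hash function: the `shard_for` helper and the use of SHA-256 are assumptions chosen purely for illustration. The point is only that the same shard key value always maps to the same shard.

```python
import hashlib


def shard_for(shard_key_value: str, number_of_shards: int) -> int:
    """Map a shard key value to a shard index.

    Illustrative only: ArangoDB uses its own internal hash function.
    """
    digest = hashlib.sha256(shard_key_value.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a shard index.
    return int.from_bytes(digest[:8], "big") % number_of_shards


# By default the shard key is the document's _key attribute.
documents = [{"_key": "alice"}, {"_key": "bob"}, {"_key": "charlie"}]
placement = {doc["_key"]: shard_for(doc["_key"], 3) for doc in documents}
print(placement)
```

Because the mapping is deterministic, a lookup by shard key can go straight to the responsible shard instead of asking every server.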

Graph Basics

A graph in ArangoDB is described by a Graph Definition. In short, the Graph Definition gets a descriptive name and a list of participating collections. Collections, which are the containers for the documents themselves, are either Document Collections or Edge Collections. The data nodes are stored in Document Collections; the relations connecting those nodes are called edges and are stored in Edge Collections.


An edge is a relation that connects two nodes with each other. In ArangoDB an edge is always persisted with a fixed direction, but can be accessed in any direction within graph queries. To get all details and option parameters about Graphs, Vertices, Edges and Graph Definitions in ArangoDB itself, please read our full documentation. In case you want to learn more about graphs in general, please refer to this blog article.

Example Graph Concept

To help you understand the major differences between single server instances and clustered environments, we’ll introduce a simple example graph concept we’ll make use of throughout this blog post.

Our example graph consists only of a limited number of nodes to keep it simple. Those nodes are connected via edges. On a single server instance all of that data would be stored on a single machine.


Challenges with Distributed Graphs

On a single server instance, all data we have is stored locally. For any graph search, whether it is a Traversal (DFS, BFS or Weighted) or a Shortest Path (Weighted, K-Paths or All-Paths) calculation, we achieve the best performance as we do not need to perform any additional network requests or manage collections being split into shards onto different database servers. This statement is true until the machine’s capacity is reached.

Therefore we need to organize our graph in such a way that we can scale out to multiple machines. At the same time, we must organize our data in a smart way so that we can run as many computations as possible on the database servers themselves (e.g. to allow parallel computations). This optimization reduces the amount of network communication, and network requests are usually one of the main causes of poor performance.

The better the graph data itself is organized, the better the query performance will be. Better data organization means that we aim for the best possible data locality, as this drastically reduces the total number of network requests we need to make, while at the same time we can fan out to multiple machines and continue our graph algorithm there in parallel.

In the upcoming sections we will cover the different types of graph concepts in ArangoDB with a special look at their distribution in a clustered environment and their sharding characteristics.

Sharded Graphs Concepts

ArangoDB offers multiple graph types to create and handle graphs in a clustered environment with different sharding strategies. All types work in the same manner, using Graph Definitions.

The Community Edition of ArangoDB includes:

  • General Graph

The Enterprise Edition of ArangoDB adds two major graph types on top:

  • EnterpriseGraphs
  • SmartGraphs

(ArangoDB offers even more specialized graph types we’ll dig into in an upcoming blog article.)

All of them support various configuration parameters during creation, e.g.:

The number of Shards used per Collection, which is automatically applied to every Document or Edge Collection that is part of the Graph Definition. It describes the number of logical partitions (Shards) that a collection will be divided into.

The Replication Factor, which specifies the number of replicas that should be created for each shard in a database cluster.

The Write Concern, which specifies the minimum number of database servers or replicas that must acknowledge a write operation before the operation is considered successful.
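
As a rough sketch of how these three parameters fit together, the following Python snippet builds the kind of JSON body one might send when creating a graph over ArangoDB's HTTP API. The field names (`edgeDefinitions`, `numberOfShards`, `replicationFactor`, `writeConcern`) follow the graph creation API as we understand it, so please verify them against the documentation for your version. The `graph_creation_body` helper, the collection names, and the sanity check are assumptions made for illustration.

```python
import json


def graph_creation_body(name, edge_definitions, number_of_shards,
                        replication_factor, write_concern):
    """Build a graph creation request body (hypothetical helper)."""
    # A write can never be acknowledged by more replicas than exist.
    if write_concern > replication_factor:
        raise ValueError("writeConcern cannot exceed replicationFactor")
    return {
        "name": name,
        "edgeDefinitions": edge_definitions,
        "options": {
            "numberOfShards": number_of_shards,       # shards per collection
            "replicationFactor": replication_factor,  # copies of each shard
            "writeConcern": write_concern,            # acknowledgements required
        },
    }


body = graph_creation_body(
    name="social",
    edge_definitions=[
        {"collection": "knows", "from": ["persons"], "to": ["persons"]}
    ],
    number_of_shards=3,
    replication_factor=2,
    write_concern=1,
)
print(json.dumps(body, indent=2))
```

With these example values, every collection of the graph is split into three shards, each shard exists twice in the cluster, and a write is considered successful as soon as one replica has acknowledged it.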

General Graph

The General Graph is the basic graph type in ArangoDB that allows you to create and manage graphs. It is suitable for small-scale graph use cases and does not require any specific configuration or setup. In a General Graph, the data is randomly distributed across all configured machines and each machine takes an equal portion of the data. It is very easy to set up, as no knowledge about the data is required. This graph type will always work, but the approach has a few disadvantages. As we distribute randomly, neighbors will land on different machines. Edges will therefore, with high probability, also end up on different servers than their connected nodes. In the worst case, for a relation (Bob) ⇒ (Alice), Bob lands on database server A, the relation itself lands on database server B and Alice lands on database server C, which leads to a lot of required network requests. This ultimately shows up as poor query execution performance.
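
To get a feeling for how rarely a relation stays on one machine under random distribution, here is a small simulation. It is a simplified model, not a measurement of ArangoDB: we assume that the source vertex, the edge, and the target vertex are each placed independently and uniformly on one of the servers. Under that assumption, the chance that all three share a server approaches 1/n² for n servers.

```python
import random


def fully_local_fraction(num_servers: int, num_edges: int, seed: int = 42) -> float:
    """Estimate how often source, edge and target land on the same server
    when each of them is placed uniformly at random (simplified model)."""
    rng = random.Random(seed)
    local = 0
    for _ in range(num_edges):
        source = rng.randrange(num_servers)
        edge = rng.randrange(num_servers)
        target = rng.randrange(num_servers)
        if source == edge == target:
            local += 1
    return local / num_edges


for servers in (1, 3, 9):
    fraction = fully_local_fraction(servers, 100_000)
    print(f"{servers} servers: {fraction:.3f} of relations are fully local")
```

The more servers you add, the smaller the share of relations that can be followed without a network request, which is exactly the problem the other graph types address.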


EnterpriseGraph

The EnterpriseGraph is a more advanced type of graph in ArangoDB that is designed to support large-scale graph use cases in enterprise environments. EnterpriseGraphs will allow you to create graphs at scale with automated sharding key selection.

While the data itself is also “randomly sharded” (like in General Graphs), this specific graph type ensures that all edges adjacent to a vertex are co-located on the same server. This approach provides significant advantages as it minimizes the impact of having suboptimal sharding keys defined when creating the graph. It will give a vast performance benefit for all graphs sharded in an ArangoDB Cluster, reducing network hops substantially as more graph calculations can be done on the database nodes themselves.

The only consequence is that you cannot define custom _key values on edges, as ArangoDB calculates the value based on the shard key, and the EnterpriseGraph graph type does this calculation automatically. EnterpriseGraphs store some additional meta information, which leads to slightly higher disk usage compared to General Graphs.


SmartGraph

While EnterpriseGraphs are already improving the data distribution across multiple database servers, SmartGraphs are optimizing data distribution even further. Graphs in general know nothing of themselves. But, your application knows a lot about the graph. In many datasets there are highly interconnected communities, but few connections between these communities. For instance, a set covering your customers, regions or any other logic you apply to organize your graph at the application layer can in turn be used in sharding the graph through the cluster.

Think about a social network. In a social network, which connects people through relations, a person is likely to have most of their friends, relatives or followers in the same region. From this knowledge, we can choose the ideal property to define the graph’s sharding. This property is called the smartGraphAttribute and needs to be defined for a SmartGraph in ArangoDB. This results in highly-connected communities landing on the same database server, which will improve query execution times even more compared to EnterpriseGraphs. A simple performance comparison between a General Graph and a SmartGraph example can be found here.
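
The idea behind the smartGraphAttribute can be sketched in a few lines of Python. This is a toy model and not ArangoDB's implementation: the `shard_for` helper, the SHA-256 hash, the `region` attribute, and the sample people are all made up for illustration. What it shows is that hashing a shared attribute, rather than each person's own key, puts a whole community on the same shard.

```python
import hashlib


def shard_for(value: str, number_of_shards: int) -> int:
    """Illustrative hash-to-shard mapping (not ArangoDB's real hash)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % number_of_shards


# Made-up sample data: "region" plays the role of the smartGraphAttribute.
people = [
    {"_key": "alice", "region": "EU"},
    {"_key": "bob", "region": "EU"},
    {"_key": "carol", "region": "US"},
    {"_key": "dave", "region": "US"},
]
friendships = [("alice", "bob"), ("carol", "dave"), ("alice", "carol")]

# Shard by the community attribute instead of the individual key.
placement = {p["_key"]: shard_for(p["region"], 3) for p in people}

# A friendship is local when both people live on the same shard.
local = [(a, b) for a, b in friendships if placement[a] == placement[b]]
print(placement)
print("local friendships:", local)
```

Friendships within a region are always local in this model, so only the comparatively rare cross-region relations may need a network hop.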

The storage size of SmartGraphs is comparable to the storage size of EnterpriseGraphs. At most it will be equal to the size of an EnterpriseGraph. At best it is close to the resource consumption of a General Graph.

Conclusion

All the graph types ArangoDB offers have several advantages and disadvantages. All of them share the ability to scale your graph to multiple machines in a clustered environment. You can always start exploring using the General Graph.

If the performance does not fit your needs, try the EnterpriseGraph, as you don’t need to manually take care of the graph data distribution. You should directly see better query execution times.

In case your requirements are still not met, you should start to organize your graph dataset in a smarter way. While you should always think about an ideal way of modeling your graph, the next graph type requires you to define a property for sharding. Think about a good way to structure your graph into multiple communities with the primary goal of using SmartGraphs, which will then drastically improve performance because the data can be organized across all database servers in the best way possible.

Therefore, think carefully about which graph type best fits the schema of your graph and its dataset. Ideally, you choose the best fitting graph type for your use case in advance, as converting to another graph type might cause extra effort (find details about graph migration here).

Next

Want to learn even more about advanced graph sharding strategies? Stay connected. A more advanced blog post will follow soon about OneShard Graphs, Disjoint SmartGraphs, Hybrid SmartGraphs, Hybrid EnterpriseGraphs and SatelliteGraphs, which allow you to take graph sharding fine-tuning to the next level.

Call to Action

Thank you for reading!

Feel free to connect and ask us any questions @ https://www.arangodb.com/community/.

Also don’t forget to check our free-trial on our cloud offering, ArangoGraph Insights Platform: https://www.arangodb.com/download/#try-cloud

The post Three Ways to Scale your Graph appeared first on ArangoDB.

]]>
https://arangodb.com/2023/05/three-ways-to-scale-your-graph/feed/ 0
May 2023: What’s the Latest with ArangoDB? https://arangodb.com/2023/05/arangodb-newsletter-151/ https://arangodb.com/2023/05/arangodb-newsletter-151/#respond Thu, 04 May 2023 09:51:03 +0000 https://www.arangodb.com/?p=42426 Welcome to the May ArangoDB newsletter. Thank you for reading! 📖  Here are some of the things we’re excited to share with you this month: Upcoming Webinar: ArangoDB 3.11 release The GA release of ArangoDB 3.11 is imminent 🥳 Join ArangoDB CTO Jörg Schad in our upcoming webinar to learn more about the latest version…

The post May 2023: What’s the Latest with ArangoDB? appeared first on ArangoDB.

]]>

Estimated reading time: 4 minutes

Welcome to the May ArangoDB newsletter. Thank you for reading! 📖 

Here are some of the things we’re excited to share with you this month:

Upcoming Webinar: ArangoDB 3.11 release


The GA release of ArangoDB 3.11 is imminent 🥳 Join ArangoDB CTO Jörg Schad in our upcoming webinar to learn more about the latest version of ArangoDB firsthand. 

RSVP today to save your spot. 

Wednesday, May 31st, 2023 – 11:00am PT / 2:00pm ET / 8:00pm CEST

Register now ->


From the Blog: Combat Fraud with Graph

Fraud is one of the most significant issues facing businesses today, resulting in more than $3.7 trillion in annual losses (Murphy, 2022). Fraud comes in numerous forms, including but not limited to money laundering, identity theft, account takeover, and payment fraud. One technology that is increasingly being used to detect and prevent fraud is graph databases. 

Check out this blog post to learn how graph databases can help with fraud detection.

Read the blog ->


Watch On-Demand: Cyber Security at Finite State with ArangoDB

Are you struggling to defend your organization against increasingly sophisticated cyber attacks? Then check out our recent webinar, Cyber Security at Finite State with ArangoDB, to:

  • Learn how graph database technology can be used to combat cyber threats
  • Discover how Finite State uses ArangoGraph to address cyber threats
  • Understand why ArangoDB is a powerful tool for analyzing and visualizing cyber threat data
  • See a demonstration of how ArangoGraph, our cloud-based graph data and analytics platform, can be used for knowledge graph and intrusion and anomaly detection

Watch on-demand ->


New Case Study: Global Relay

Global Relay is a leading provider of compliant electronic communications archiving, messaging, supervision, information governance, and eDiscovery to 20,000+ customers in 90 countries. In this case study, you can learn how ArangoDB replaced Elasticsearch in order for Global Relay to scalably incorporate contextual relevance into its intuitive directory search, allowing its customers to seamlessly find who they want to communicate with.


Read the case study ->


Community Spotlight: Five driver tutorials added to ArangoDB University

Our course catalog at ArangoDB University continues to grow with the addition of five new driver tutorials. Enroll today to learn how to use ArangoDB as the backend for your application from C#/.NET, Go, Java, and Python:

ArangoDB University is FREE and aims to help you level up your graph, ArangoDB, and AQL skills with interactive online courses, all powered by our managed service ArangoGraph Insights Platform.

What are you waiting for? Enroll today!

Register now ->


Make 2023 the Year for Cloud: Try ArangoGraph Insights Platform!


Our next-generation graph data and analytics platform, ArangoGraph, is the easiest way to run ArangoDB in the cloud. It provides all the features of ArangoDB Enterprise Edition, with the ability to elastically scale up (or down!) to cost-effectively meet your needs – not to mention expert support by the people who built ArangoGraph and ArangoDB’s distributed systems.

Check it out! Sign up for your free, 14-day trial today.

Start free trial ->


From the Avocado Grove: We’re Hiring!

Despite the current economic conditions, Team Avocado continues to grow. 🥑 We have the following roles open at ArangoDB:

If you are interested, please apply! Or if you know someone who’s looking, please pass along. 🙏


Can we reward you with a $25 Amazon gift card?

How would you like to earn a $25 Amazon gift card? How, you ask? It’s easy! Simply click on the link below and give us an honest review. 

Write a review now and earn a $25 gift card! ->


We hope you enjoyed our latest news!

Until next month,
The ArangoDB Team 🥑

The post May 2023: What’s the Latest with ArangoDB? appeared first on ArangoDB.

]]>
https://arangodb.com/2023/05/arangodb-newsletter-151/feed/ 0
Graph and Entity Resolution Against Cyber Fraud https://arangodb.com/2023/04/graph-and-entity-resolution-against-cyber-fraud/ https://arangodb.com/2023/04/graph-and-entity-resolution-against-cyber-fraud/#respond Mon, 03 Apr 2023 14:37:56 +0000 https://www.arangodb.com/?p=42305 With the growing prevalence of the internet in our daily lives, the risks of malware, ransomware, and other cyber fraud are rising. The digital nature of these attacks makes it very easy for fraudsters to scale by creating thousands of accounts, so even if one is identified, they can continue their attacks.In this blog post,…

The post Graph and Entity Resolution Against Cyber Fraud appeared first on ArangoDB.

]]>

Estimated reading time: 4 minutes

With the growing prevalence of the internet in our daily lives, the risks of malware, ransomware, and other cyber fraud are rising. The digital nature of these attacks makes it very easy for fraudsters to scale by creating thousands of accounts, so even if one is identified, they can continue their attacks.
In this blog post, we will discuss how graph and entity resolution (ER) can help us battle these risks across different industries such as healthcare, finance, and e-commerce (for example, the US healthcare system alone can save $300 billion a year with entity resolution). You will also receive hands-on experience with entity resolution on ArangoDB.

What is Entity Resolution?

Entity resolution, which is also referred to as record linkage or deduplication, is a technique used to identify and merge similar or identical entities from multiple data sources into a single record. Imagine, for example, a fraudster creating many thousands of accounts across different services. Entity resolution can help match these virtual records and resolve them into one entity or record. 

Entity resolution can be used for a variety of purposes, such as identifying duplicate customer records in a marketing database, matching medical records to patients, detecting fraudulent activities by identifying multiple identities of the same person, or linking social media profiles to a single individual. Entity resolution can improve the accuracy and completeness of data by reducing redundancy, eliminating errors, and creating a unified view of the data. It can also be used to facilitate data integration and analysis, as well as support various applications such as recommendation systems, personalized marketing, and fraud detection.

User accounts: Merging multiple user entries into one entity.

Graph to the Rescue 

Entity resolution is a challenge as there are typically no distinct keys identifying each entity.

Let us revisit the above example of user accounts and experience one of my daily frustrations with the German umlaut in my first name. Depending on the document or account, it is either spelled Joerg or Jörg (let us not even get started on pronunciation). Matching different accounts is a challenge because the name is not a unique identifier. This is often the case even without weird letters; just imagine J. Smith vs Joerg S.

Luckily, in most real world scenarios we have additional contextual information, such as emails, phone numbers, addresses, and employers. Here is where graph comes into play, as this information can be easily assembled as a graph. Note that usually the context won’t 100% match, e.g., I have multiple email addresses and phone numbers, moved around quite a bit, and worked for different companies, so very rarely will all entities have all potential information categories (e.g., the account Joerg S. on the right side does not have connected employer information). Still, we can treat each individual piece as evidence and then compute a similarity score.

In practice, this can be done by comparing the neighborhoods of users (i.e., the nodes connected to each user node in the graph) with similarity measures such as cosine similarity or Jaccard similarity. Depending on the data, all nodes whose similarity exceeds a certain threshold are considered the same entity. Feel free to try yourself using this Jupyter notebook.
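
As a minimal sketch of the neighborhood comparison, here is the Jaccard similarity applied to a few made-up accounts. The account names, the identifiers, and the threshold of 0.5 are assumptions for illustration only and are not taken from the notebook linked above.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up neighborhoods: the context nodes connected to each account.
neighborhoods = {
    "account_jorg": {"email:joerg@example.com", "phone:0001", "employer:ExampleCorp"},
    "account_joerg_s": {"email:joerg@example.com", "phone:0001"},
    "account_j_smith": {"email:jsmith@example.com", "phone:0002"},
}

THRESHOLD = 0.5  # assumed value; in practice this is tuned on real data


def same_entity(first: str, second: str) -> bool:
    """Treat two accounts as one entity if their neighborhoods overlap enough."""
    score = jaccard_similarity(neighborhoods[first], neighborhoods[second])
    return score >= THRESHOLD


print(same_entity("account_jorg", "account_joerg_s"))
print(same_entity("account_jorg", "account_j_smith"))
```

Note that the missing employer on the second account lowers the score but does not prevent a match, which mirrors the point that each shared piece of context counts as evidence.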

There are also other techniques, including graph machine learning, but we will cover that in more detail in another blog post.

How to Battle Cyber Fraud with ER

Entity resolution is used as a tool for cyber security by identifying and linking together various pieces of related information that can be used to catch fraudulent activity. This can include IP addresses, emails, or other means of device authentication. By analyzing these and other similar data points, entity resolution can help investigators detect the true identity of the fraudster and track their activities across multiple accounts or platforms.

There are several ways entity resolution can be used for cyber security. A few of these use cases include fraud detection, risk assessment, and identity verification. Some examples of entity resolution being put into play are identifying indicators of fraud, analyzing the risk associated with a particular transaction or user account, and verifying the identity of users through different identification methods. 

Imagine a social media platform is trying to remove duplicate or fraudulent accounts. While there might be multiple accounts that go under the name Jane Foster, there would be a limited number of those accounts that would share a similar birth date. If you continue to narrow down your results, there may be 3 accounts under the name Jane Foster that use the same IP address and mobile device to connect to the duplicate accounts. This can also be used to identify spam accounts by associating devices and IP networks. 

Learn More About Entity Resolution with ArangoDB

In this blog, we have discussed how graph and entity resolution can help with both cyber fraud and names with German umlauts! If you want to learn more about these topics take a look at the following additional resources.

Lunch + Learn Session

Entity Resolution in ArangoDB Blog Post

Fraud Detection with ArangoDB Webinar:

https://hopin.com/events/fraud-detection-with-arangodb

Jupyter Notebook

https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/EntityResolution.ipynb

The post Graph and Entity Resolution Against Cyber Fraud appeared first on ArangoDB.

]]>
https://arangodb.com/2023/04/graph-and-entity-resolution-against-cyber-fraud/feed/ 0
Combat Fraud with Graph https://arangodb.com/2023/03/combat-fraud-with-graph/ https://arangodb.com/2023/03/combat-fraud-with-graph/#respond Wed, 15 Mar 2023 11:59:58 +0000 https://www.arangodb.com/?p=42220 Fraud is one of the most significant issues facing businesses today. While companies have always faced fraud, detecting fraudulent activity has become even more challenging due to increased online transactions. Globally, fraud results in more than $3.7 trillion in annual losses (Murphy, 2022). Fraud comes in numerous forms, including but not limited to money laundering,…

The post Combat Fraud with Graph appeared first on ArangoDB.

]]>

Estimated reading time: 5 minutes

Fraud is one of the most significant issues facing businesses today. While companies have always faced fraud, detecting fraudulent activity has become even more challenging due to increased online transactions. Globally, fraud results in more than $3.7 trillion in annual losses (Murphy, 2022). Fraud comes in numerous forms, including but not limited to money laundering, identity theft, account takeover, and payment fraud. Due to the variety of ways companies can face fraud, they must have a system to protect themselves and their customers.

One technology that is increasingly being used to detect and prevent fraud is graph databases. This blog post will explain graph databases and how they can help with fraud detection.

What are Graph Databases?

A graph is a collection of nodes (points) and edges (lines) where the edges describe the relationship between the nodes. Graphs can be explored through graph theory, analytics, and database models. This database form is considered the next step for data and analytics to get the most out of their delivery. Graph databases give a way to organize and present data for use cases previously considered difficult or complicated to address appropriately.

A graph database stores the data and its natural relationships as a graph of nodes and edges instead of disconnected rows and columns in a table that you would see in a traditional relational database. 

Graph databases have built-in algorithms for standard graph analytics functions such as shortest path, k shortest paths, k paths, and all shortest paths.
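
To illustrate what a shortest path calculation does, here is a minimal breadth-first search over an unweighted graph in plain Python. This is a teaching sketch and not how ArangoDB implements it; in a graph database you would use the built-in functionality instead. The tiny example graph is made up.

```python
from collections import deque


def shortest_path(graph: dict, start: str, goal: str):
    """Return a shortest path (fewest edges) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each node was reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Made-up directed graph given as adjacency lists.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(shortest_path(graph, "A", "E"))
print(shortest_path(graph, "E", "A"))
```

Breadth-first search explores the graph level by level, so the first time it reaches the goal it has found a path with the fewest edges.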

Graph Database vs. Fraud Detection

Graph databases are ideal for fraud detection because they can quickly and efficiently identify patterns and connections between seemingly unrelated entities. Here are a few ways in which graph databases can help with fraud detection:

Relationship Analysis: Fraudulent activity often involves multiple entities, such as a fraudster, a victim, and a middleman. Graph databases can identify behavior patterns indicative of fraud by analyzing the relationships between these entities. For example, if a fraudster uses multiple email addresses to create fake accounts, a graph database can identify them and link them to the same person.

Real-time Detection: Graph databases can process large amounts of data in real time, making them ideal for detecting fraud as it happens. By continuously analyzing data streams, graph databases can detect behavior patterns indicative of fraud and trigger alerts or block transactions.

Machine Learning: Graph databases can be integrated with machine learning algorithms to improve fraud detection. By training machine learning models on historical data, graph databases can identify behavior indicative of fraud and use those patterns to predict future fraud.

Learn more about Machine Learning with ArangoDB and ArangoGraphML here.

Centralized Data Management: Graph databases can provide a single source of truth for fraud detection data. Companies can easily track and analyze fraud patterns across different systems and departments by collecting and storing all fraud detection data in a graph database.

Entity Resolution for Fraud Detection

Entity Resolution is a critical tool in combating fraud. Entity resolution finds duplicates of entities across multiple systems on-prem and in the cloud. This allows admins to have a clear view of data and to sort through different data types such as date, contact, email, address, device, or any additional unique identifier.

When applied to fraud detection, entity resolution allows admins to find duplicates of fraudsters within their system by cross-referencing these unique identifiers. This will enable fraudsters running different scams through similar networks to be flagged and taken down.
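
The cross-referencing described above can be sketched as grouping accounts that share at least one identifier. The Python snippet below uses a small union-find structure for this. It is a simplified illustration with made-up accounts: the `resolve_entities` helper is hypothetical, and a real system would weigh the evidence rather than treat a single shared identifier as proof.

```python
from collections import defaultdict


def resolve_entities(accounts: dict) -> list:
    """Group accounts that are linked, directly or indirectly,
    by shared identifiers (hypothetical helper)."""
    parent = {name: name for name in accounts}

    def find(name):
        # Follow parent links to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    def union(first, second):
        parent[find(first)] = find(second)

    first_seen = {}  # identifier -> first account that used it
    for name, identifiers in accounts.items():
        for identifier in identifiers:
            if identifier in first_seen:
                union(name, first_seen[identifier])
            else:
                first_seen[identifier] = name

    groups = defaultdict(set)
    for name in accounts:
        groups[find(name)].add(name)
    return list(groups.values())


# Made-up accounts with the identifiers observed for each of them.
accounts = {
    "acct1": {"email:a@example.com", "device:x"},
    "acct2": {"device:x", "ip:203.0.113.7"},
    "acct3": {"ip:203.0.113.7"},
    "acct4": {"email:z@example.com"},
}
groups = resolve_entities(accounts)
print(groups)
```

Here acct1 and acct3 share nothing directly, yet they end up in the same group because both are linked to acct2. Following such indirect connections is what makes the graph view useful for uncovering fraud networks.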

ArangoDB as a Graph Database

ArangoDB goes beyond graph by being a graph store that natively incorporates capabilities from other data models, including key-value, document, search, and more. The graph capabilities of ArangoDB are similar to a property graph database but add more flexibility in data modeling as vertices and edges are both full JSON documents.

Due to this natively integrated support, users can take the result of a JOIN operation, geospatial query, text search, or any other access pattern as a starting point for further graph analysis and vice versa – all in one query, if needed.

ArangoDB is the underlying database for ArangoGraph Insights Platform, a cloud-based graph data and analytics platform.

Fraud Detection with ArangoDB

Today’s criminals are developing new techniques to hide their activities by forming fraud networks with stolen or synthetic identities. Attacks are often launched from multiple vectors and can only be discovered by connecting diverse data sources to uncover difficult-to-detect patterns. Native graph technology is perfect for solving this challenge.

A graph database is a suitable solution as it decreases the time needed to process fraud detection queries against the database and delivers simple data visualizations to analysts. It also helps reduce false positives, which matter because they leave real customers waiting for their money and cost both customer satisfaction and revenue. Fraud detection is a great use case for a graph database, as relational databases are too slow and complex to query in real time.

Try out our Fraud Detection guide on ArangoGraph Insights Platform → https://cloud.arangodb.com/home?utm_campaign=2023%20Fraud%20Campaigns&utm_source=fraud%20detection%20blog

Conclusion

Fraud detection is a critical aspect of modern business, and graph databases can help companies detect and prevent fraud by identifying patterns and connections in complex and interconnected data. By leveraging the power of graph databases, companies can improve their fraud detection systems and better protect themselves and their customers.

Additional Resources

Read “Identifying Fraud at Scale with ArangoDB” → https://www.arangodb.com/resources/white-paper/fraud-detection/?utm_campaign=2023%20Fraud%20Campaigns&utm_source=fraud%20detection%20blog

Did you miss our “Fraud Detection with ArangoDB” webinar? Watch it on demand today! → https://hopin.com/events/fraud-detection-with-arangodb?utm_source=Blog%20Post&utm_campaign=Fraud%20Detection%20Campaign

The post Combat Fraud with Graph appeared first on ArangoDB.

]]>
https://arangodb.com/2023/03/combat-fraud-with-graph/feed/ 0