Graph databases
Common use cases
- Social scenarios: friend recommendation, social-relationship analysis (followers/following), e.g. through how few people can you reach Trump?
- Knowledge graph: untangle a confusing web of relationships and quickly find what you need
- Recommendation scenarios: personalized recommendations for retail, e-commerce, content, etc.; mine user needs and improve the user experience
- Navigation / urban planning: shortest-path queries
- Finance: asset-transaction graphs
- Node profiling: device profiles, user profiles, etc.
Basic concepts
- Node (V): People, movies, recipes, ingredients
- Node property (VP): the person's age, height, weight, and occupation
- Relationship (E): the relationship between a person and a movie (liked, favorited, rated, etc.)
- Relationship property (EP): the score a person gives when rating a movie
- Label: classifies nodes and relationships, e.g. user, device
Graph databases handle this kind of highly connected data better than relational databases
Entities: user --- recipe
Common recommendation logic
- item ==> item
- Recommend similar recipes based on content (ingredients, nutrition, labels, etc.)
- user ==> item ==> user
- Find similar users based on shared cooking records
- user ==> item ==> user ==> item
- Recommend the recipes that the most similar users usually make
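As a minimal sketch of the i2i pattern above, using hypothetical ingredient data (all names invented for illustration):

```python
# Content-based i2i: rank recipes by how many ingredients they share
# with a given recipe. Data here is hypothetical sample data.

recipes = {
    "chicken stew with shiitake": {"chicken", "shiitake", "ginger"},
    "chicken stew with mushrooms": {"chicken", "mushroom", "ginger"},
    "braised pork": {"pork", "soy sauce", "sugar"},
}

def similar_by_ingredients(name: str, top: int = 2) -> list:
    """Return up to `top` other recipes, most shared ingredients first."""
    base = recipes[name]
    ranked = sorted(
        (r for r in recipes if r != name),
        key=lambda r: -len(recipes[r] & base))
    return ranked[:top]

print(similar_by_ingredients("chicken stew with shiitake"))
# ['chicken stew with mushrooms', 'braised pork']
```

The same ranking-by-overlap idea extends to recipe labels or nutrition facts by swapping the ingredient sets for other feature sets.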
Commonly used recommendation algorithms in graph databases
- Content filtering
- Collaborative filtering
- Cosine similarity
- Pearson similarity
- Neighborhood-based recommendation
Content recommendations
As the name suggests, recommendations based on content
Content-based recommendation over similar recipe ingredients: i2i
Chicken stew with mushrooms and chicken stew with shiitake mushrooms share the same ingredients
So you can rank and recommend recipes that share ingredients with chicken stew with shiitake mushrooms
Recipe labels, nutrition facts, and so on can also serve as the basis for recommendations
Jaccard u2i2u
Formula: J(A, B) = |A ∩ B| / |A ∪ B|
Recommend similar users based on having made the same recipes
Union: all recipes made by either user
Intersection: recipes both users have made
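The union/intersection logic can be sketched as follows (hypothetical recipe sets; `jaccard` is an illustrative name, not an API from the article):

```python
def jaccard(made_a: set, made_b: set) -> float:
    """|A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = made_a | made_b              # all recipes either user has made
    if not union:
        return 0.0
    return len(made_a & made_b) / len(union)

user_a = {"chicken stew", "braised pork", "fried rice"}
user_b = {"chicken stew", "fried rice", "hot pot"}
print(jaccard(user_a, user_b))  # 2 shared out of 4 total -> 0.5
```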
Cosine similarity u2i2u
Formula: cos(A, B) = (A · B) / (‖A‖ · ‖B‖)
Sample data:
Compute the cosine similarity between user A and user B
Full-set cosine: (the correct cosine similarity)
Intersection cosine: (cosine similarity with some values missing)
Because graph-database traversal logic naturally joins users on their shared recipes, part of the values the cosine calculation needs are dropped
Accuracy issues
What about users A and C?
They have also made the same recipes
Expand the recipe sample
In other words, the discrepancy mainly affects new users who have made only a few recipes, i.e. when the sets are small. As the recipe sample grows and users make more of the same recipes, the gap between the two variants shrinks: the more recipes a user has made, the more accurate the recommendation, and for frequent cooks the deviation is reduced.
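The gap between the two variants can be seen in a small sketch (hypothetical make counts; all names are illustrative):

```python
import math

def cosine(xs, ys):
    """Plain cosine similarity over two equal-length vectors."""
    dot = sum(x * y for x, y in zip(xs, ys))
    na = math.sqrt(sum(x * x for x in xs))
    nb = math.sqrt(sum(y * y for y in ys))
    return dot / (na * nb) if na and nb else 0.0

a = {"r1": 5, "r2": 3}   # user A's make counts per recipe (hypothetical)
b = {"r1": 4, "r3": 2}   # user B's make counts per recipe

# Full-set cosine: missing recipes count as 0 (the "correct" variant).
keys = sorted(a.keys() | b.keys())
full = cosine([a.get(k, 0) for k in keys], [b.get(k, 0) for k in keys])

# Intersection cosine: only recipes both users made, which is what a
# graph traversal over shared recipes naturally yields; the zero terms
# vanish from the norms, inflating the score.
shared = sorted(a.keys() & b.keys())
inter = cosine([a[k] for k in shared], [b[k] for k in shared])

print(full, inter)  # inter is 1.0 here: only one shared dimension remains
```

With more shared recipes per user pair, `full` and `inter` converge, matching the observation above.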
Pearson similarity u2i2u
Formula: r(X, Y) = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / (√Σᵢ (xᵢ − x̄)² · √Σᵢ (yᵢ − ȳ)²)
Sample data:
Compute the Pearson similarity between user A and user B
Missing values are filled with the user's average make count
Full-set Pearson:
Intersection Pearson:
Comparison with cosine similarity:
The two formulas differ only in the expectation used for missing values (zero vs. the average); each can be derived from the other
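The relationship between the two formulas can be made concrete: Pearson similarity is cosine similarity applied to mean-centered vectors (hypothetical data; names are illustrative):

```python
import math

def cosine(xs, ys):
    dot = sum(x * y for x, y in zip(xs, ys))
    na = math.sqrt(sum(x * x for x in xs))
    nb = math.sqrt(sum(y * y for y in ys))
    return dot / (na * nb) if na and nb else 0.0

def pearson(xs, ys):
    # Subtract each user's mean, then take the cosine: the "null
    # expectation" shifts from zero to the user's average make count.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return cosine([x - mx for x in xs], [y - my for y in ys])

a = [5, 3, 0, 1]   # user A's make counts (hypothetical)
b = [4, 0, 2, 1]   # user B's make counts
print(pearson(a, b))
```

Centering makes Pearson robust to users who simply cook more overall, at the cost of the extra passes over the data noted in the performance comparison below.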
Neighborhood-based recommendation u2i2u2i
Recommend recipes based on the most similar users (using cosine similarity as the metric)
Similarity between users:
User A ---> User B--->0.617
User A ---> User C--->0.954
User A ---> User D--->0.796
==> Take the two most similar users: user C and user D
==> 红烧肉 (braised pork) = 8 × 0.954 + 0 × 0.796 = 7.632
==> 辣椒炒肉 (chili fried pork) = 1 × 0.954 + 1 × 0.796 = 1.75
==> Sort by weighted sum; the recipes with the highest weighted sum come first
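The weighted-sum step above can be sketched as follows, reproducing the numbers from the worked example (make counts for users C and D are hypothetical, chosen to match the arithmetic):

```python
# Weighted-sum scoring over the top-K most similar users.
similar = {"C": 0.954, "D": 0.796}       # top-2 neighbors of user A

# Each neighbor's make counts for the candidate recipes.
made = {
    "C": {"braised pork": 8, "chili fried pork": 1},
    "D": {"braised pork": 0, "chili fried pork": 1},
}

scores = {}
for user, sim in similar.items():
    for recipe, count in made[user].items():
        scores[recipe] = scores.get(recipe, 0.0) + sim * count

# Sort by weighted sum, highest first.
for recipe, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(recipe, round(score, 3))
# braised pork 7.632
# chili fried pork 1.75
```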
What we expect from a good recommendation:
- Distinctive, personalized recommendations
- Accurate results
- Fast performance
In real recommendation scenarios, functionality, quality, and performance interact and constrain one another; a compromise is often needed
1. Choosing a similarity algorithm
Performance: Cosine similarity >> Pearson similarity
Accuracy: Cosine similarity < Pearson similarity
In the initial phase on Alibaba Cloud GDB, we use cosine similarity as the similarity coefficient between users
2. Data volume is too large, full computation too slow (data filtering)
When recommending recipes based on the most similar users,
adopt a data-filtering funnel:
10,000 / 100,000 / 1,000,000 ==> find the users who have made the most of the same recipes
1,000 / 10,000 ==> collaborative filtering
10 / 100 ==> weighted sum
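The three-stage funnel can be sketched like this (all data and stage sizes here are hypothetical placeholders for the article's counts):

```python
import math
import random

random.seed(0)

# Hypothetical data: 1,000 users, each with make counts for 8 of 50 recipes.
recipes = [f"r{i}" for i in range(50)]
users = {f"u{i}": {r: random.randint(1, 3) for r in random.sample(recipes, 8)}
         for i in range(1000)}
target = users["u0"]

def cosine(a: dict, b: dict) -> float:
    keys = a.keys() | b.keys()
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stage 1: of all users, keep the 100 with the most co-made recipes.
stage1 = sorted((u for u in users if u != "u0"),
                key=lambda u: -len(users[u].keys() & target.keys()))[:100]

# Stage 2: collaborative filtering; keep the 10 most similar survivors.
stage2 = sorted(stage1, key=lambda u: -cosine(users[u], target))[:10]

# Stage 3: weighted sum over the top neighbors' recipes.
scores = {}
for u in stage2:
    sim = cosine(users[u], target)
    for r, c in users[u].items():
        if r not in target:              # recommend only unseen recipes
            scores[r] = scores.get(r, 0.0) + sim * c

top = sorted(scores, key=lambda r: -scores[r])[:5]
print(top)
```

Each stage runs a cheap filter over a large set so that the expensive computation only ever sees a small one.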
3. Cold-data issues
4. Timeout truncation
After a timeout, fall back to hot data to fill out the results
Author: xiaospace
Source: WeChat public account "Joyoung technical team"
Source: https://mp.weixin.qq.com/s/cq5GNDeT495xVwdAUTavvQ