Thursday, July 9, 2009

Related Product by Tag words - Solution

I have got a nice solution for the tag keyword and related post problem.

Here I want to illustrate the problem first. Suppose you have few books to post with some keywords.

C++ in 7 days

Book, Programming, computer, C

Advance MySql

Book ,Programming, computer, database

Harry Potter

Novel, Fantasy,

Anatomy handbook

Book, Medical

Advance C

Programming, computer, C

Now whenever I want to view the related books of c++ in 7 days. The books will be as following as the tag weight

1. Advance C

2. Advance MYSql

3. Anatomy book.

But the problem I described at the previous post is I have to query each time for each tag. To solve this problem I used the ‘TOXI Solution’ Method.

I split the books and tags into two different db tables. And they have many to many relationships (m n).

Book: bookId(PK), Book Name

Tag: tagId(PK), tagWord

And to develop the relationship another relational table

TagMap: Id(PK), bookId(FK), tagId(FK)

Now the query will be

SELECT b.*,@p1 = ''

FROM TagMap bt, Book b, Tag t

WHERE bt.tagId = t.tagId

AND (t.tagId IN (SELECT @p1 := tagId FROM TagMap WHERE bookId = '1'))

AND = bt.bookId AND <> '1'


HAVING (COUNT( )<= count(@p1) AND COUNT( >= 1) LIMIT 0,3

Wednesday, July 8, 2009

Related Product by Tag words

Recently I am doing a social networking project. In this project there is a option to get related products with given tags. Whenever I was doing the case study of this project i didn't give that much emphasize on the tag and related product. I thought about phew this is nothing. But at the time I involve myself to the database design I found oh my god what I thought about the tag word.

I just want to describe the problem here. In real time scenario I thought that I will pick each tag and search the product which also has the same tag. And after that i will count the maximum matched products. And show as related product. But after giving a few light over the problem I found a nice problem. I thought there could be a product with 32 or more then 32 tags. Even it could be 200. [Later I will decide it will be limited up to 32]. And if the ‘tag table’ consists fifty thousand records. Then I have to search full database 32 times or more than that. Then I have to calculate the whole result. If i think about n tags for a product then I have to query n+1 times. Just to show 2 or 3 related products.

I was thinking about reducing the query number. As my boss I was thinking about FULL TEXT Search. But I asked myself how delicious solve the same problem. They have much more data than I imagine. Later all credits goes to Google. I found few methods to solve the issue. But I am not even getting what are actually the databases experts want to say. I will discuss later how I solve this problem.

Here I attached some url where I got some ideas about Tag schema Design and queries.

Tags Database Schemas

Tag System: Performance tests

Tag Schema