User login

Union of 2 tables that share several column

13 posts / 0 new
Last post
Nico1990
Offline
Joined: 02/17/2012
Union of 2 tables that share several column

Hi,

Sorry if this subject has already been treated, but I can't fin it.

I have 2 tables, they have 2 identical column, the third is different. They don't have the same number of rows. I'd like to make a "union" using the common columns, add the different columns and get all the data in one table.

So I use the "union" node and select the common columns, but it doesn't work as I wish. I think, the easiest way to explain the result is an example.

Table1: 

 

id

smile

ic50

mol1

c1ccccc1

258

mol2

c1ccncc1

123

mol3

ccocc

987

Table2:

 

id

smile

origin

mol1

c1ccccc1

extern

mol2

c1ccncc1

intern

mol4

C1CCCCC1

extern

mol5

ch4

intern

Result:

 

id

smile

ic50

origin

mol1

c1ccccc1

258

extern

mol2

c1ccncc1

123

intern

mol3

ccocc

987

?

?

?

?

extern

?

?

?

intern

My expectation:

 

id

smile

ic50

origin

mol1

c1ccccc1

258

extern

mol2

c1ccncc1

123

intern

mol3

ccocc

987

?

mol4

C1CCCCC1

?

extern

mol5

ch4

?

intern

I hope it's clear. I don't see any other solution.

Thanks for your help

Nico

gabriel
gabriel's picture
Offline
Joined: 01/26/2007

Sounds like a problem for the Joiner node, but I am not sure if this helps in this case. I would recommend to have a look into the Reference Column Filter node. This node will give you all column that match or don't match with a so-called reference table. The result need to be "merged" with the original table. How would you merge rows that appear in one but not in the other table?

richards99
richards99's picture
Offline
Joined: 12/06/2010

I would use the Joiner node, and choose to Join by the smiles column (so change it from RowID to smiles)

You choose how you want the behaviour of the join to operate.

If you only want the output to contain a table where the molecules where in both tables, choose Inner Join. If you want all entries in the output, regardless of whether there was a match in both tables or not, choose Full Outer Join.

All other columns will be automatically appended, but if you only want specific columns, go to the "Column Selection" tab in the Joiner node and hand pick the desired columns.

 

Hope this helps,

Simon.

Nico1990
Offline
Joined: 02/17/2012

Hi,

Firstly, sorry for my previous post, I have used the table fonction to make my example, the display is very bad. You could see below a better representation of my cases.

Table1 (left table):  

id, smiles, ic50

mol1, c1ccccc1, 258

mol2, c1ccncc1, 123

mol3, ccocc, 987

Table2 (right table):

id, smiles, origin

mol1, c1ccccc1, extern

mol2, c1ccncc1, intern

mol4, C1CCCCC1, extern

mol5, ch4, intern

Result:

id, smiles, ic50, origin

mol1, c1ccccc1, 258, extern

mol2, c1ccncc1, 123, intern

mol3, ccocc, 987, ?

?, ?, ?, extern

?, ?, ?, intern

My expectation:

id, smiles, ic50, origin

mol1, c1ccccc1, 258, extern

mol2, c1ccncc1, 123, intern

mol3, ccocc, 987, ?

mol4, C1CCCCC1, ?, extern

mol5, ch4, ?, intern

Then, I have tried several ways to join my 2 tables (using several joining columns, selecting differents number of columns from the left or the right table...), but each time I lose the data from the right table.

The better thing I get is when I duplicate the columns I want.

Something like this:

id, smiles, ic50, origin, id(*), smiles(*)

mol1, c1ccccc1, 258, extern, mol1, c1ccccc1

mol2, c1ccncc1, 123, intern, mol2, c1ccncc1

mol3, ccocc, 987, ?, mol3, ccocc

?, ?, ?, extern, mol4, C1CCCCC1

?, ?, ?, intern, mol5, ch4

It's not very nice...

 

 

 

 

richards99
richards99's picture
Offline
Joined: 12/06/2010

As far as I can tell simply selecting the full outer join option in the joiner node should give the desired output. Select smiles columns for the join.

Simon.

Nico1990
Offline
Joined: 02/17/2012

I tried, but that didn't solve the problem

gabriel
gabriel's picture
Offline
Joined: 01/26/2007

Did you select the "duplicate" column that you want to include from the second table within the dialog tab "Column Selection"?

Nico1990
Offline
Joined: 02/17/2012

Yes the "always include all columns" options are checked for both tables. I have tried the 3 different ways of handling duplicate column and it's not better.

gabriel
gabriel's picture
Offline
Joined: 01/26/2007

Next try. I guess you are complaining about the join columns which contain missing values (?) in this case no match in the left/right table was found. This is a known limitation of the Joiner and can be fixed using the Column Merger afterwards.

s.roughley
Offline
Joined: 03/15/2011

Did you un-check the 'filter joined columns' boxes for the left and right tables?

Nico1990
Offline
Joined: 02/17/2012

Yest I did

Iris
Iris's picture
Offline
Joined: 05/29/2010

And, is it working using the Column Merger?

You have to do it for each ColumnPair separately.

Nico1990
Offline
Joined: 02/17/2012

The "merger column" node solves the problem (I mean it hides the problem of duplication columns).

Thank you to all of you!

Nico