Division des valeurs dynamics dans deux colonnes en plusieurs colonnes – Suppression des duplicates

J'ai du mal à créer une requête qui peut split plusieurs valeurs dans une colonne en plusieurs colonnes pour aider à "dédupliquer" un set de données.

Mieux expliqué dans datatables ci-dessous, mais fondamentalement, vous remarquerez un champ d'intervalle qui est un RANK DENSE à travers les colonnes ID, START, FINISH, DURATION, COD. En raison de chevauchements multiples des valeurs PSSID et CSSID, ces intervalles sont dupliqués. J'aimerais savoir s'il existe un bon moyen de split dynamicment les champs PSSID et CSSID qui se chevauchent en plusieurs colonnes …! Ok, alors qu'est-ce que je veux dire en fait …

EXEMPLE DE DONNÉES:

ID START FINISH DURA COD INT PSSID CSSID A1 33.18 33.27 0.09 ST 15 N13045 NULL A1 33.18 33.27 0.09 ST 15 N13046 NULL A1 33.27 33.285 0.015 DU 16 N13046 NULL A1 33.27 33.285 0.015 DU 16 NULL N20015 A1 33.27 33.285 0.015 DU 16 NULL N2001516 A1 33.27 33.285 0.015 DU 16 NULL N20033 A1 33.285 33.35 0.065 BM 17 N13046 NULL A1 33.285 33.35 0.065 BM 17 NULL N20015 A1 33.285 33.35 0.065 BM 17 NULL N2001516 A1 33.285 33.35 0.065 BM 17 NULL N20033 A1 33.35 33.395 0.045 DM 18 N13046 NULL A1 33.35 33.395 0.045 DM 18 NULL N20015 A1 33.35 33.395 0.045 DM 18 NULL N2001516 A1 33.35 33.395 0.045 DM 18 NULL N20033 A1 33.395 33.44 0.045 DN 19 N13046 NULL A1 33.395 33.44 0.045 DN 19 NULL N20015 A1 33.395 33.44 0.045 DN 19 NULL N2001516 A1 33.395 33.44 0.045 DN 19 NULL N20033 A1 33.44 33.485 0.045 BM 20 N13046 NULL A1 33.44 33.485 0.045 BM 20 NULL N2001516 A1 33.44 33.485 0.045 BM 20 NULL N20033 A1 33.44 33.485 0.045 BM 20 NULL N20034 A1 33.485 33.51 0.025 DN 21 N13046 NULL A1 33.485 33.51 0.025 DN 21 NULL N2001516 A1 33.485 33.51 0.025 DN 21 NULL N20033 A1 33.485 33.51 0.025 DN 21 NULL N20034 A1 33.51 33.595 0.085 DB 22 N13046 NULL A1 33.51 33.595 0.085 DB 22 NULL N2001516 A1 33.51 33.595 0.085 DB 22 NULL N20034 A1 33.595 33.665 0.07 DN 23 N13046 NULL A1 33.595 33.665 0.07 DN 23 NULL N2001516 A1 33.595 33.665 0.07 DN 23 NULL N20034 A1 33.665 33.785 0.12 DB 24 NULL N2001516 A1 33.785 33.79 0.005 YS 25 NULL NULL A1 33.79 33.83 0.04 BM 26 NULL NULL 

SORTIE DÉSIRÉE:

 ID START FINISH DURA COD INT PSSID1 PSSID2 CSSID1 CSSID2 CSSID3 A1 33.18 33.27 0.09 ST 15 N13046 N13045 NULL NULL NULL A1 33.27 33.285 0.015 DU 16 N13046 NULL N20015 N2001516 N20033 A1 33.285 33.35 0.065 BM 17 N13046 NULL N20015 N2001516 N20033 A1 33.35 33.395 0.045 DM 18 N13046 NULL N20015 N2001516 N20033 A1 33.395 33.44 0.045 DN 19 N13046 NULL N20015 N2001516 N20033 A1 33.44 33.485 0.045 BM 20 N13046 NULL N20034 N2001516 N20033 A1 33.485 33.51 0.025 DN 21 N13046 NULL N20034 N2001516 N20033 A1 33.51 33.595 0.085 DB 22 N13046 NULL N20034 N2001516 NULL A1 33.595 33.665 0.07 DN 23 N13046 NULL N20034 N2001516 NULL A1 33.665 33.785 0.12 DB 24 NULL NULL NULL N2001516 NULL A1 33.785 33.79 0.005 YS 25 NULL NULL NULL NULL NULL A1 33.79 33.83 0.04 BM 26 NULL NULL NULL NULL NULL 

Pour ne rien arranger, il ne s'agit que d'une petite fraction de données d'échantillon, il peut y avoir plus de trois champs PSSID, CSSID pour un intervalle donné (bien que cela doive avoir une limite supérieure de 5). La requête doit donc être dynamic pour permettre cela.

J'utilise SQL Server 2012. Le schéma pour datatables ci-dessus est fourni ci-dessous:

 CREATE TABLE #SampleData ([ID] varchar(2), [START] decimal(9,2), [FINISH] decimal(9,2), [DURA] decimal(9,2), [COD] varchar(2), [INT] int, [PSSID] varchar(6), [CSSID] varchar(8)) ; INSERT INTO #SampleData ([ID], [START], [FINISH], [DURA], [COD], [INT], [PSSID], [CSSID]) VALUES ('A1', 33.18, 33.27, 0.09, 'ST', 15, 'N13045', NULL), ('A1', 33.18, 33.27, 0.09, 'ST', 15, 'N13046', NULL), ('A1', 33.27, 33.285, 0.015, 'DU', 16, 'N13046', NULL), ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N20015'), ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N2001516'), ('A1', 33.27, 33.285, 0.015, 'DU', 16, NULL, 'N20033'), ('A1', 33.285, 33.35, 0.065, 'BM', 17, 'N13046', NULL), ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N20015'), ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N2001516'), ('A1', 33.285, 33.35, 0.065, 'BM', 17, NULL, 'N20033'), ('A1', 33.35, 33.395, 0.045, 'DM', 18, 'N13046', NULL), ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N20015'), ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N2001516'), ('A1', 33.35, 33.395, 0.045, 'DM', 18, NULL, 'N20033'), ('A1', 33.395, 33.44, 0.045, 'DN', 19, 'N13046', NULL), ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N20015'), ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N2001516'), ('A1', 33.395, 33.44, 0.045, 'DN', 19, NULL, 'N20033'), ('A1', 33.44, 33.485, 0.045, 'BM', 20, 'N13046', NULL), ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N2001516'), ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N20033'), ('A1', 33.44, 33.485, 0.045, 'BM', 20, NULL, 'N20034'), ('A1', 33.485, 33.51, 0.025, 'DN', 21, 'N13046', NULL), ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N2001516'), ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N20033'), ('A1', 33.485, 33.51, 0.025, 'DN', 21, NULL, 'N20034'), ('A1', 33.51, 33.595, 0.085, 'DB', 22, 'N13046', NULL), ('A1', 33.51, 33.595, 0.085, 'DB', 22, NULL, 'N2001516'), ('A1', 33.51, 33.595, 0.085, 'DB', 22, NULL, 'N20034'), ('A1', 33.595, 33.665, 0.07, 'DN', 23, 'N13046', NULL), ('A1', 33.595, 33.665, 0.07, 'DN', 23, NULL, 'N2001516'), ('A1', 33.595, 33.665, 0.07, 'DN', 23, NULL, 'N20034'), ('A1', 33.665, 33.785, 0.12, 'DB', 24, NULL, 'N2001516'), ('A1', 33.785, 33.79, 0.005, 'YS', 25, NULL, NULL), ('A1', 33.79, 33.83, 0.04, 'BM', 26, NULL, NULL) ; 

Appréciez toute votre aide!

Vous avez déjà défini les groupes créant la colonne INT . Nous pouvons l'utiliser, pour faire pivot à la fois PSS et CSS séparément et ensuite les join.

 SELECT * INTO #DataSourcePSS FROM ( SELECT [INT] ,[PSSID] ,CONCAT('PSSID',ROW_NUMBER() OVER (PARTITION BY [INT] ORDER BY [PSSID] DESC)) AS [RowID] FROM #SampleData ) DS PIVOT ( MAX([PSSID]) FOR RowID IN ([PSSID1], [PSSID2], [PSSID3], [PSSID4], [PSSID5]) ) PVT SELECT * INTO #DataSourceCSS FROM ( SELECT [INT] ,[CSSID] ,CONCAT('CSSID', ROW_NUMBER() OVER (PARTITION BY [INT] ORDER BY [CSSID] DESC)) AS [RowID] FROM #SampleData ) DS PIVOT ( MAX([CSSID]) FOR RowID IN ([CSSID1], [CSSID2], [CSSID3], [CSSID4], [CSSID5]) ) PVT; WITH DataSourceSD AS ( SELECT DISTINCT [ID], [START], [FINISH], [DURA], [COD], [INT] FROM #SampleData ) SELECT SD.* ,PSS.[PSSID1],PSS.[PSSID2],PSS.[PSSID3],PSS.[PSSID4],PSS.[PSSID5] ,CSS.[CSSID1],CSS.[CSSID2],CSS.[CSSID3],CSS.[CSSID4],CSS.[CSSID5] FROM DataSourceSD SD INNER JOIN #DataSourcePSS PSS ON SD.[INT] = PSS.[INT] INNER JOIN #DataSourceCSS CSS ON SD.[INT] = CSS.[INT] ORDER BY SD.[INT]; DROP TABLE #DataSourceCSS; DROP TABLE #DataSourcePSS; DROP TABLE #SampleData; 

entrez la description de l'image ici

Comme vous pouvez avoir jusqu'à cinq valeurs dans chaque groupe, je pivoterai sur cinq valeurs. Dans ce cas, vous pouvez avoir des colonnes qui n'ont pas de valeurs. Si ce n'est pas OK vous devez utiliser PIVOT dynamic à la place.

Essayez comme ça

MISE À JOUR selon votre commentaire

Ceci est maintenant sortingé en fonction de la numérotation interne des valeurs distinctes de vos PSSID et CSSID. Autre que votre sortie désirée, je reçois 4 valeurs comme le N20034 est dans une nouvelle colonne. Je ne vois pas de logique comment décider dans quelle colonne une valeur devrait apparaître … La key pour cela est la numérotation. Ma première approche numérote les valeurs en fonction de leur position par rapport au "parent", cette nouvelle approche les sortinge pour se placer dans une colonne distincte chacune …

 WITH Numbered AS ( SELECT * ,CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END AS ColumnName ,ROW_NUMBER() OVER(PARTITION BY ID,Start,Finish,Dura,COD,[INT],CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END ORDER BY (SELECT NULL)) AS SortNr FROM #SampleData ) ,DistinctPSSIDS AS ( SELECT DISTINCT DENSE_RANK() OVER(ORDER BY PSSID) AS SortNr ,PSSID FROM #SampleData WHERE PSSID IS NOT NULL ) ,DistinctCSSIDS AS ( SELECT DISTINCT DENSE_RANK() OVER(ORDER BY CSSID) AS SortNr ,CSSID FROM #SampleData WHERE CSSID IS NOT NULL ) SELECT ID,Start,Finish,Dura,COD,[INT] ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=1 THEN n.PSSID END) AS PSSID1 ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=2 THEN n.PSSID END) AS PSSID2 ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=3 THEN n.PSSID END) AS PSSID3 ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=4 THEN n.PSSID END) AS PSSID4 ,MAX(CASE WHEN n.ColumnName='PSSID' AND dp.SortNr=5 THEN n.PSSID END) AS PSSID5 ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=1 THEN n.CSSID END) AS CSSID1 ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=2 THEN n.CSSID END) AS CSSID2 ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=3 THEN n.CSSID END) AS CSSID3 ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=4 THEN n.CSSID END) AS CSSID4 ,MAX(CASE WHEN n.ColumnName='CSSID' AND dc.SortNr=5 THEN n.CSSID END) AS CSSID5 FROM Numbered AS n LEFT JOIN DistinctPSSIDS AS dp ON dp.PSSID=n.PSSID LEFT JOIN DistinctCSSIDS AS dc ON dc.CSSID=n.CSSID GROUP BY ID,Start,Finish,Dura,COD,[INT] 

Le résultat

 +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | ID | Start | Finish | Dura | COD | INT | PSSID1 | PSSID2 | PSSID3 | PSSID4 | PSSID5 | CSSID1 | CSSID2 | CSSID3 | CSSID4 | CSSID5 | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.18 | 33.27 | 0.09 | ST | 15 | N13045 | N13046 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.27 | 33.29 | 0.02 | DU | 16 | NULL | N13046 | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.29 | 33.35 | 0.07 | BM | 17 | NULL | N13046 | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.35 | 33.40 | 0.05 | DM | 18 | NULL | N13046 | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.40 | 33.44 | 0.05 | DN | 19 | NULL | N13046 | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.44 | 33.49 | 0.05 | BM | 20 | NULL | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20033 | N20034 | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.49 | 33.51 | 0.03 | DN | 21 | NULL | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20033 | N20034 | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.51 | 33.60 | 0.09 | DB | 22 | NULL | N13046 | NULL | NULL | NULL | NULL | N2001516 | NULL | N20034 | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.60 | 33.67 | 0.07 | DN | 23 | NULL | N13046 | NULL | NULL | NULL | NULL | N2001516 | NULL | N20034 | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.67 | 33.79 | 0.12 | DB | 24 | NULL | NULL | NULL | NULL | NULL | NULL | N2001516 | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.79 | 33.79 | 0.01 | YS | 25 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ | A1 | 33.79 | 33.83 | 0.04 | BM | 26 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+--------+----------+--------+--------+--------+ 

précédent: old-fashion-pivot avec GROUP BY et agrégats

 WITH Numbered AS ( SELECT * ,CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END AS ColumnName ,ROW_NUMBER() OVER(PARTITION BY ID,Start,Finish,Dura,COD,[INT],CASE WHEN PSSID IS NOT NULL THEN 'PSSID' ELSE 'CSSID' END ORDER BY (SELECT NULL)) AS SortNr FROM #SampleData ) SELECT ID,Start,Finish,Dura,COD,[INT] ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=1 THEN PSSID END) AS PSSID1 ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=2 THEN PSSID END) AS PSSID2 ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=3 THEN PSSID END) AS PSSID3 ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=4 THEN PSSID END) AS PSSID4 ,MAX(CASE WHEN ColumnName='PSSID' AND SortNr=5 THEN PSSID END) AS PSSID5 ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=1 THEN CSSID END) AS CSSID1 ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=2 THEN CSSID END) AS CSSID2 ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=3 THEN CSSID END) AS CSSID3 ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=4 THEN CSSID END) AS CSSID4 ,MAX(CASE WHEN ColumnName='CSSID' AND SortNr=5 THEN CSSID END) AS CSSID5 FROM Numbered GROUP BY ID,Start,Finish,Dura,COD,[INT] 

Le résultat

 +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | ID | Start | Finish | Dura | COD | INT | PSSID1 | PSSID2 | PSSID3 | PSSID4 | PSSID5 | CSSID1 | CSSID2 | CSSID3 | CSSID4 | CSSID5 | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.18 | 33.27 | 0.09 | ST | 15 | N13045 | N13046 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.27 | 33.29 | 0.02 | DU | 16 | N13046 | NULL | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.29 | 33.35 | 0.07 | BM | 17 | N13046 | NULL | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.35 | 33.40 | 0.05 | DM | 18 | N13046 | NULL | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.40 | 33.44 | 0.05 | DN | 19 | N13046 | NULL | NULL | NULL | NULL | N20015 | N2001516 | N20033 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.44 | 33.49 | 0.05 | BM | 20 | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20033 | N20034 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.49 | 33.51 | 0.03 | DN | 21 | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20033 | N20034 | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.51 | 33.60 | 0.09 | DB | 22 | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20034 | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.60 | 33.67 | 0.07 | DN | 23 | N13046 | NULL | NULL | NULL | NULL | N2001516 | N20034 | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.67 | 33.79 | 0.12 | DB | 24 | NULL | NULL | NULL | NULL | NULL | N2001516 | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.79 | 33.79 | 0.01 | YS | 25 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+ | A1 | 33.79 | 33.83 | 0.04 | BM | 26 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | +----+-------+--------+------+-----+-----+--------+--------+--------+--------+--------+----------+----------+--------+--------+--------+