Répartition des fréquences par colonne dans SQL Server

J'ai une table avec la structure suivante

year age score weight ------------------------------ 2008 16 100 3 2008 25 150 2 2009 40 210 2 2009 22 50 3 2009 65 90 3

J'ai besoin de find la dissortingbution de fréquence du score par âge.

La sortie attendue est la dissortingbution du score par âge en%: (Sortie non précise – pour la représentation uniquement, la ligne totalise 1)

  0-50 51-100 101-150 151-200 201-250 2008 16-35 0.3 0.2 0.1 0.4 0 2008 36-40 2008 41-65 2008 66+ 2008 All 2009 16-35 2009 36-40 2009 41-65 2009 66+

Je suis venu avec la requête ci-dessous dire pour un groupe

 select yearofsale, sum(sum(case when age between 16 and 35 AND score between 0 and 50 then 1 else 0 end) * weight) / sum(case when age between 16 and 35 and score between 0 and 50 then weight else 1 end) from table group by yearofsale

mais je suis à peu près sûr qu'il existe un moyen plus simple de le faire. Une idée?

Merci, abeille

Vous pouvez d'abord effectuer une sous-requête qui ajoute les informations sur le groupe d'âge et le groupe de scores à chaque logging.

Ensuite, vous pouvez append à cela le poids total par année et par groupe d'âge.

Après cela, vous feriez le calcul du pourcentage par âge et groupe de score. Et finalement vous faites pivoter ce résultat dans la table croisée:

 select yearofsale, age_group, [0-50], [51-100], [101-150], [151-200], [201-250] from ( select yearofsale, age_group, score_group, 100.0*sum(weight) / min(total_weight) as pct from ( select *, sum(weight) over (partition by yearofsale, age_group) as total_weight from ( select *, case when age >= 66 then '66+' when age >= 41 then '41-65' when age >= 36 then '36-40' when age >= 16 then '16-35' end as age_group, case when score < 51 then '0-50' when score < 101 then '51-100' when score < 151 then '101-150' when score < 201 then '151-200' when score < 251 then '201-250' end as score_group from table) as base ) as base2 group by yearofsale, age_group, score_group) as base3 pivot ( sum(pct) for score_group in ([0-50], [51-100], [101-150], [151-200], [201-250]) ) as pivotTable

La sortie pour datatables d'échantillon dans la question d'origine est:

  yearofsale | age_group | 0-50 | 51-100 | 101-150 | 151-200 | 201-250 ------------+-----------+--------+--------+---------+---------+--------- 2008 | 16-35 | NULL | 60.00 | 40.00 | NULL | NULL 2009 | 16-35 | 100.00 | NULL | NULL | NULL | NULL 2009 | 36-40 | NULL | NULL | NULL | NULL | 100.00 2009 | 41-65 | NULL | 100.00 | NULL | NULL | NULL

Voici une approche dynamic où vous obtiendrez la masortingce complète .

Les niveaux masortingciels sont stockés dans une table et, en ajustant les parameters, vous pouvez modifier l'axe et même la source.

Considérer ce qui suit:

 Declare @Source varchar(150)= 'YourTable' Declare @KeyCol varchar(150)= 'Year' Declare @YTier varchar(50) = 'Age' Declare @YMeas varchar(50) = 'Age' Declare @XTier varchar(50) = 'Score' Declare @XMeas varchar(50) = 'Score' Declare @SQL varchar(max) = ' ;with cte1 as ( Select '+@KeyCol+' ,YSeq = max(Y.Seq) ,YTitle = max(Y.Title) ,XSeq = max(X.Seq) ,XTitle = max(X.Title) ,Value = sum(Weight) From '+@Source+' A Join Tier Y on (Y.Tier='''+@YTier+''' and A.'+@YMeas+' between Y.R1 and Y.R2) Join Tier X on (X.Tier='''+@XTier+''' and A.'+@XMeas+' between X.R1 and X.R2) Group By '+@KeyCol+',Y.Seq,X.Seq Union All Select '+@KeyCol+' ,YSeq = Y.Seq ,YTitle = Y.Title ,XSeq = X.Seq ,XTitle = X.Title ,Value = 0 From (Select Distinct '+@KeyCol+' from '+@Source+') A Cross Join (Select Distinct Seq,Title From Tier where Tier='''+@YTier+''') Y Cross Join (Select Distinct Seq,Title From Tier where Tier='''+@XTier+''') X ) , cte2 as (Select '+@KeyCol+',YSeq,RowTotal=sum(Value) from cte1 Group By '+@KeyCol+',YSeq) , cte3 as (Select A.* ,PctRow = Format(case when B.RowTotal=0 then 0 else (A.Value*100.0)/B.RowTotal end,''#0.0'') From cte1 A Join cte2 B on A.'+@KeyCol+'=B.'+@KeyCol+' and A.YSeq=B.YSeq ) Select * Into #Temp From cte3 Declare @SQL2 varchar(max) = Stuff((Select '','' + QuoteName(Title) From Tier where Tier='''+@XTier+''' Order by Seq For XML Path('''')),1,1,'''') Select @SQL2 = '' Select ['+@KeyCol+'],[YTitle] as '+@YTier+','' + @SQL2 + '' From (Select '+@KeyCol+',YSeq,YTitle,XTitle,PctRow=max(PctRow) from #Temp Group BY '+@KeyCol+',YSeq,YTitle,XTitle) A Pivot (max(PctRow) For [XTitle] in ('' + @SQL2 + '') ) p'' Exec(@SQL2); ' Exec(@SQL)

Résultats

entrez la description de l'image ici

Les niveaux sont stockés dans une structure générale. Cela permet plusieurs versions. La table de niveau ressemble à ceci:

entrez la description de l'image ici