Hi All,
I need to return query results of a table with one of the column values updated to that of another row's column value some of the time. What I can't seem to do with LAG or LEAD is use that updated value to determine if a third row's column value should be updated, which is why I think a recursive CTE sounds appropriate.
We have a list of patients hospital stays, with some patients having multiple rows. If a patient's second hospital stay started less than 30 days after their first hospital stay, we want the discharge date of the second hospital stay updated to use the discharge date of the first hospital stay. We want to continue this compare-and-update process for every row (other than the first row) for all of the remaining records for that patient. For example, here's my table:
CREATE TABLE #PatientStays ( PatientID INT , AdmitDate DATE , DischargeDate DATE);
INSERT INTO #PatientStays VALUES
(1, '2020-01-01', '2020-01-05') ,
(1, '2020-01-28', '2020-02-02') ,
(1, '2020-03-15', '2020-03-16') ,
(1, '2020-04-05', '2020-04-11') ,
(2, '2020-01-18', '2020-01-25') ,
(2, '2020-03-01', '2020-03-09') ,
(2, '2020-04-04', '2020-04-06');
And here's how we want the results returned:
SELECT 1 AS PatientID, '2020-01-01' AS AdmitDate, '2020-01-05' AS DischargeDate
UNION ALL SELECT 1, '2020-01-28', '2020-01-05' UNION ALL
SELECT 1, '2020-03-15', '2020-03-16' UNION ALL
SELECT 1, '2020-04-05', '2020-03-16' UNION ALL
SELECT 2, '2020-01-18', '2020-01-25' UNION ALL
SELECT 2, '2020-03-01', '2020-03-09' UNION ALL
SELECT 2, '2020-04-04', '2020-03-09'
Thanks for any help you can provide. I've done some reading on recursive CTE's, but everything I find seems to be more complicated than what I'm trying to do and I get tangled up.
Thanks again,
Do you have any updates?
Please remember to accept the answers if they helped. Your action would be helpful to other users who encounter the same issue and read this thread.
;With cteRN As
(Select PatientID, AdmitDate, DischargeDate,
Row_Number() Over (Partition By PatientID Order By AdmitDate) As RN
From #PatientStays),
cteDates As
(Select PatientID, AdmitDate, DischargeDate, RN,
DischargeDate As NewDischargeDate
From cteRN
Where RN = 1
Union All
Select d.PatientID, r.AdmitDate, r.DischargeDate, r.RN,
Case When DateDiff(day, d.DischargeDate, r.AdmitDate) < 30 Then d.DischargeDate Else r.DischargeDate End As NewDischargeDate
From cteDates d
Inner Join cteRN r On d.PatientID = r.PatientID And d.RN + 1 = r.RN)
Update s
Set s.DischargeDate = d.NewDischargeDate
From #PatientStays s
Inner Join cteDates d On s.PatientID = d.PatientID And s.AdmitDate = d.AdmitDate;
-- Check Result
Select PatientID, AdmitDate, DischargeDate
From #PatientStays
Order By PatientID, AdmitDate;
I think your last row of the output is not correct since difference between '2020-03-01' and '2020-04-04' is bigger than 30 days. So it should not be updated. Try this:
WITH CTE AS (
SELECT *,
LAG(AdmitDate, 1) OVER (PARTITION BY PatientID ORDER BY AdmitDate) AS PreviousAdmitDate,
LAG(DischargeDate, 1) OVER (PARTITION BY PatientID ORDER BY DischargeDate) AS PreviousDischargeDate
FROM #PatientStays
UPDATE t
SET t.DischargeDate = CASE WHEN DATEDIFF(DAY, c.PreviousAdmitDate, c.AdmitDate) < 30 THEN c.PreviousDischargeDate ELSE t.DischargeDate END
FROM #PatientStays AS t
INNER JOIN CTE AS c ON c.PatientID = t.PatientID AND c.AdmitDate = t.AdmitDate AND c.DischargeDate = t.DischargeDate
SELECT * FROM #PatientStays
In order to handle chains of DischargeDate's (like when you added the new row with an AdmitDate of 2020-04-12, use the following (the only change from my first answer is to use "Then d.NewDischargeDate" instead of "Then d.DischargeDate" in the CASE statement.
;With cteRN As
(Select PatientID, AdmitDate, DischargeDate,
Row_Number() Over (Partition By PatientID Order By AdmitDate) As RN
From #PatientStays),
cteDates As
(Select PatientID, AdmitDate, DischargeDate, RN,
DischargeDate As NewDischargeDate
From cteRN
Where RN = 1
Union All
Select d.PatientID, r.AdmitDate, r.DischargeDate, r.RN,
Case When DateDiff(day, d.DischargeDate, r.AdmitDate) < 30 Then d.NewDischargeDate Else r.DischargeDate End As NewDischargeDate
From cteDates d
Inner Join cteRN r On d.PatientID = r.PatientID And d.RN + 1 = r.RN)
Update s
Set s.DischargeDate = d.NewDischargeDate
From #PatientStays s
Inner Join cteDates d On s.PatientID = d.PatientID And s.AdmitDate = d.AdmitDate;
-- Check Result
Select PatientID, AdmitDate, DischargeDate
From #PatientStays
Order By PatientID, AdmitDate;
drop table if exists #PatientStays
CREATE TABLE #PatientStays ( PatientID INT , AdmitDate DATE , DischargeDate DATE);
INSERT INTO #PatientStays VALUES
(1, '2020-01-01', '2020-01-05'),
(1, '2020-01-28', '2020-02-02'),
(1, '2020-03-15', '2020-03-16'),
(1, '2020-04-05', '2020-04-11'),
(1, '2020-04-12', '2020-04-13'),
(2, '2020-01-18', '2020-01-25'),
(2, '2020-03-01', '2020-03-09'),
(2, '2020-04-04', '2020-04-06')
select *
from #PatientStays
order by PatientID, AdmitDate
with Q1 as
select *, lag(DischargeDate) over (partition by PatientId order by AdmitDate) as pdd
from #PatientStays
Q2 as
select *
from Q1
where pdd is null or DATEDIFF(d, pdd, AdmitDate) > 30
Q3 as
select ps.*, Q2.DischargeDate as pdd
from #PatientStays ps
inner join Q2 on Q2.PatientID = ps.PatientID and DATEDIFF(d, Q2.DischargeDate, ps.AdmitDate) between 1 and 30
update Q3
set DischargeDate = pdd
select * from #PatientStays
order by PatientID, AdmitDate
PatientID AdmitDate DischargeDate
----------- ---------- -------------
1 2020-01-01 2020-01-05
1 2020-01-28 2020-01-05
1 2020-03-15 2020-03-16
1 2020-04-05 2020-03-16
1 2020-04-12 2020-03-16
2 2020-01-18 2020-01-25
2 2020-03-01 2020-03-09
2 2020-04-04 2020-03-09
That works for that set of data, but will not work if the chain of admissions is longer than 3. For example, if you added one more datapoint for PatientID 1 with AdmitDate of 2020-04-17 and DischargeDate 0f 2020-04-20.
That's the reason for using recursive CTE's. They can handle chains of any length. Of course, if any chain is longer than 10, you need to use the MAXRECURRSION option.
Hi @EricBragas-0496 ,
Recursive cte does not seem to solve your problem.
I have provided another method, the first cte uses the row-number function to mark the row number, and the second cte uses the join and row number to realize the self-joining of the table, so that the data to be compared is placed in the same row. Then you can directly compare the second hospital stay with the first hospital stay, whether the interval exceeds 30 days to update the corresponding data.
Please check:
;with cte
as (select *, row_number() over (partition by PatientID order by AdmitDate) rn
from #PatientStays)
,cte2
as (select c1.PatientID,c1.AdmitDate,c1.DischargeDate,
c2.PatientID PatientID2,c2.AdmitDate AdmitDate2,c2.DischargeDate DischargeDate2
from cte c1
left join cte c2 on c1.PatientID = c2.PatientID AND c1.rn-1 = c2.rn )
update #PatientStays
set DischargeDate = case when datediff(day,c.AdmitDate2,c.AdmitDate)<30
then c.DischargeDate2 else #PatientStays.DischargeDate end
from cte2 c
where #PatientStays.PatientID=c.PatientID AND #PatientStays.AdmitDate = c.AdmitDate
AND #PatientStays.DischargeDate = c.DischargeDate
select * from #PatientStays
If you have any question, please feel free to let me know.
If the response is helpful, please click "Accept Answer" and upvote it.
Regards
If the answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.