My C# application retrieves one million records from SQL Server, processes them, and then updates the database. This results in close to 100,000 UPDATE statements of the following form:

update table1 set col1 = <some number> where id in (n1, n2, n3, ... up to n200)
"id" int, primary key clustered index. no 2 update statements update same ids, in theory, can run in parallel without locks. therefore, ideally, suppose should run many possible in parallel. expectation finish in no more 5 minutes.
Now, the question is: what is the most efficient way of doing this? Here is what I'm trying or considering:
- Running them sequentially, one by one - the least efficient solution. It takes over an hour.
- Running them in parallel by launching each update on its own thread - again inefficient, because we're creating thousands of threads. I tried it anyway; it took over an hour, and quite a few of the updates failed because of connection issues.
- Bulk inserting the ids into a new table and doing a join update. This runs into concurrency issues, because more than one user is expected to be doing this at the same time.
- MERGE in batches instead of individual updates - Google says MERGE can be slower than individual UPDATE statements, so I haven't tried it.
I suppose this must be a common problem for the many applications out there that handle sizeable amounts of data. Are there standard solutions? Any ideas or suggestions are appreciated.
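For what it's worth, the "thousands of threads" problem from the second bullet can be avoided by capping the degree of parallelism instead of giving every statement its own thread. A minimal C# sketch, assuming `table1` with an int `id` key as in the question; `GetUpdateBatches` and the exact tuple shape are hypothetical stand-ins for however the application already builds its (value, id-list) pairs:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading.Tasks;

class BatchUpdater
{
    // Runs the prepared UPDATE batches with a bounded number of
    // concurrent workers instead of one thread per statement.
    public static void RunUpdates(string connectionString,
                                  IEnumerable<(int Value, string IdList)> batches)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };

        Parallel.ForEach(batches, options, batch =>
        {
            // Each worker opens its own connection; connection pooling
            // makes this cheap and avoids sharing a connection across threads.
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = conn.CreateCommand())
            {
                conn.Open();
                // The id lists are generated by the application, not user
                // input, so they are concatenated here; parameterize or use a
                // table-valued parameter if that assumption changes.
                cmd.CommandText =
                    $"update table1 set col1 = {batch.Value} where id in ({batch.IdList})";
                cmd.ExecuteNonQuery();
            }
        });
    }
}
```

Tuning `MaxDegreeOfParallelism` (8 is just a starting point) lets you find the spot where added concurrency stops helping and starts causing blocking or connection pressure.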
I created an integer table type so I can pass the ids to a stored procedure as a list and update the whole table with a single query.
Even if this is still slower than you'd like, it should be considerably quicker than the conventional "where id in (1, 2, 3)" approach.
Here is the definition of the type:
create type [dbo].[integer_list_tbltype] as table(
    [n] [int] not null,
    primary key clustered ([n] asc) with (ignore_dup_key = off)
)
go
And here is the usage:

declare @intval integer_list_tbltype
declare @colval int = 10

update c
set c.col1 = @colval
from table1 c
join @intval i on c.id = i.n
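On the application side, the table type can be filled from C# by passing a DataTable as a structured parameter. A sketch under the assumptions of the answer above (the type's single column is named `n`, and the target is `table1` with columns `id` and `col1`):

```csharp
using System.Data;
using System.Data.SqlClient;

class TvpUpdate
{
    // Sends one batch of ids through the integer_list_tbltype
    // table-valued parameter and applies a single set-based update.
    public static void UpdateBatch(string connectionString, int colVal, int[] ids)
    {
        // Shape the ids to match the table type: one int column named "n".
        var table = new DataTable();
        table.Columns.Add("n", typeof(int));
        foreach (var id in ids)
            table.Rows.Add(id);

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.CommandText =
                "update c set c.col1 = @colval " +
                "from table1 c join @intval i on c.id = i.n";

            cmd.Parameters.AddWithValue("@colval", colVal);

            var tvp = cmd.Parameters.AddWithValue("@intval", table);
            tvp.SqlDbType = SqlDbType.Structured;
            tvp.TypeName = "dbo.integer_list_tbltype";

            cmd.ExecuteNonQuery();
        }
    }
}
```

The `TypeName` must match the user-defined table type exactly, and `SqlDbType.Structured` is what tells ADO.NET to stream the DataTable as a table-valued parameter rather than a scalar.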
let me know if have questions.