I don't know if this will work for you, but I sometimes use a Dictionary to store the UIDs of picked objects. That way the For Each loop doesn't have to go through the whole family, only the ones stored in the dictionary. This helps when you have many units in the same family but only some of them are selected, say 20 out of 100, because you don't have to loop through the 80 that aren't selected. I've found it gives some performance benefit in some cases.
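To make the idea concrete, here is a rough sketch in plain Python rather than event-sheet form. It is not Construct code: `Unit`, `units`, `selected`, `select` and `deselect` are all made-up names for illustration, and the `units` dict just stands in for whatever the engine uses to look an instance up by UID.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    uid: int
    health: int = 100


# Stand-in for all instances in the family, keyed by UID (hypothetical).
units = {uid: Unit(uid) for uid in range(100)}

# Dictionary of selected UIDs: add a key on select, delete it on deselect.
selected = {}


def select(uid):
    selected[uid] = True


def deselect(uid):
    # pop with a default so deselecting an unselected unit is harmless
    selected.pop(uid, None)


# Select 20 of the 100 units.
for uid in range(20):
    select(uid)

# Loop over the stored UIDs only; the other 80 units are never visited.
visited = 0
for uid in selected:
    units[uid].health -= 10
    visited += 1
```

The point is only that the loop runs once per stored key (20 times here) instead of once per unit in the family. The dictionary has to be kept in sync, so a unit that is destroyed should also have its key removed.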
For me, the picking itself seems to use a lot of CPU in some cases, so I try to find more lightweight ways of picking objects. Restructuring the condition order can help a lot too.
For example:
For each unit (loops through all units)
Is selected (then picks and filters the selected ones)
Instead, use:
Is selected (picks the selected ones)
For each unit (loops through only the selected ones)
Small things like that can help a lot, but you probably know that already. Filtering down with conditions as much as possible before running any For Each loops and actions seems to work pretty well in most cases.
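The condition-order example above can be sketched the same way in plain Python. Again, this is not Construct code: the `units` list and its `selected` flag are hypothetical, and it only counts loop iterations. The assumption is that the per-iteration overhead of the loop is the expensive part, which is what it looks like to me but is not something I have measured here.

```python
# 100 hypothetical units, the first 20 of them selected.
units = [{"uid": i, "selected": i < 20} for i in range(100)]

# Order 1: loop over every unit, test "is selected" inside the loop.
loop_first_iterations = 0
loop_first_hits = []
for unit in units:
    loop_first_iterations += 1
    if unit["selected"]:
        loop_first_hits.append(unit["uid"])

# Order 2: filter down to the selected units first, then loop.
picked = [u for u in units if u["selected"]]
filter_first_iterations = 0
filter_first_hits = []
for unit in picked:
    filter_first_iterations += 1
    filter_first_hits.append(unit["uid"])
```

Both orders end up acting on the same 20 units, but the first runs the loop body 100 times and the second only 20. The filter still has to look at every unit once, so the gain depends on how much work each loop iteration carries.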