Buffer size depends on a few factors, mostly the hardware you're running on, the host, and what other plugins you have going. For the old algorithm, I typically did everything with 256, and that was 15 years ago. Modern machines should do better than that, but hosts and new plugins tend to pile on features over the years, which counterbalances the faster hardware, so it's hard to say.
The problem with that algorithm, and I suspect most of them, is that the analysis it does requires several blocks of data to think about. This was around 4096 samples. That isn't buffer size, it's latency. You could start pumping in audio data as soon as you pushed a button, but nothing would come out of it until 16 blocks later (around 93 milliseconds at 44.1k). Latency I can compensate for by "playing ahead" a little to make up for the lag, but sometimes there will be a little gap when you switch inputs to the algorithm, or change the pitch after it's been processing for a while. It's one of the artifacts.
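Here is a quick sketch of that latency arithmetic, assuming the figures mentioned above (a 4096-sample analysis window, 256-sample buffers, 44.1 kHz). The function names are just for illustration, not anything from Mobius.

```python
# Assumed figures taken from the discussion above; adjust for your own setup.
ANALYSIS_WINDOW = 4096   # samples the analysis needs before it can output anything
BUFFER_SIZE = 256        # samples per host audio block
SAMPLE_RATE = 44100      # samples per second


def latency_blocks(window: int, buffer_size: int) -> int:
    """Number of host blocks that must arrive before the first output (rounded up)."""
    return -(-window // buffer_size)


def latency_ms(window: int, sample_rate: int) -> float:
    """Latency of the analysis window in milliseconds."""
    return 1000.0 * window / sample_rate


blocks = latency_blocks(ANALYSIS_WINDOW, BUFFER_SIZE)
ms = latency_ms(ANALYSIS_WINDOW, SAMPLE_RATE)
print(f"{blocks} blocks, {ms:.1f} ms")  # prints "16 blocks, 92.9 ms"
```

Note that a smaller buffer size changes the block count but not the milliseconds. The delay comes from the analysis window, not from the host buffer.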
That's for real-time pitch shift where it behaves like the pitch bend wheel on a synth. There is another approach to this that can result in much better sound but takes preparation and is more limited. You can do the pitch shifting using a high-quality algorithm "in the background" and leave the result somewhere that can be picked up and played later. So for example, you record a loop, and the pitch shifting worker bee takes that and proceeds to shift it up and down by semitones. A second or two later there are 12 loop variants you can choose from with different pitches. While that was going on, you were soloing or talking to the audience, or something to kill the time. But after a few seconds of wait time, you'll be able to do chromatic shifts with practically no overhead or artifacts. It's not real time at first, but it is afterward, provided all you need are the pre-selected shift intervals. It's an approach that can work for building chord progressions. Record a loop of the I chord, and while that is playing back for the first time the worker is busy calculating the IV and V. When that's done you can start swapping between them.
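A minimal sketch of the background "worker bee" idea, under some loud assumptions: `VariantCache` and `shift_semitones` are hypothetical names, the 12 variants are taken to be 6 semitones down through 6 up, and the shifter is a naive resampler standing in for a real high-quality algorithm (it changes the loop length along with the pitch, which a real one would not).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def shift_semitones(samples, semitones):
    """Stand-in shifter: naive resampling with linear interpolation.
    This changes pitch AND length; a real shifter would preserve length."""
    if not samples:
        return []
    ratio = 2.0 ** (semitones / 12.0)
    out_len = max(1, int(len(samples) / ratio))
    out = []
    for i in range(out_len):
        pos = i * ratio
        j = int(pos)
        frac = pos - j
        a = samples[j]
        b = samples[j + 1] if j + 1 < len(samples) else samples[j]
        out.append(a + (b - a) * frac)
    return out


class VariantCache:
    """Renders shifted copies of a loop on a background thread."""

    def __init__(self, loop, intervals):
        self._loop = loop
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._futures = {
            n: self._pool.submit(shift_semitones, loop, n) for n in intervals
        }

    def get(self, semitones):
        """Return the shifted loop if it is ready, else None
        (the caller keeps playing whatever it has in the meantime)."""
        if semitones == 0:
            return self._loop
        future = self._futures.get(semitones)
        if future is not None and future.done():
            return future.result()
        return None

    def wait(self):
        """Block until every variant has been rendered."""
        for future in self._futures.values():
            future.result()

    def close(self):
        self._pool.shutdown()


# A tenth of a second of 220 Hz sine as a pretend recorded loop.
loop = [math.sin(2 * math.pi * 220 * t / 44100) for t in range(4410)]
intervals = [n for n in range(-6, 7) if n != 0]  # assumed choice of 12 variants

cache = VariantCache(loop, intervals)
cache.wait()  # on stage, this is the "talk to the audience" part
ready = {n: cache.get(n) for n in intervals}
cache.close()
print(len(ready), "variants ready")
```

Once `wait()` returns, switching pitch is just a dictionary lookup. That is why the shifts are cheap afterward, as long as the interval you ask for is one that was pre-rendered.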
Something I've seen in another app is kind of a hybrid approach. You can do immediate real-time pitch shift using a low-quality algorithm so you can at least start hearing the result right away, and in the background the high-quality algorithm is doing its thing. When it is done, the result gets blended in and replaces what the low-quality algorithm was doing. From the audience's perspective, the pitch changed immediately, it was a little weird and grainy, and then after a while it started to become clearer.
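The "blended in" step could be as simple as a linear crossfade from the low-quality output to the high-quality one. This is only a sketch of that idea, not how the other app does it. The function name and parameters are made up for illustration, and both signals are assumed to be time-aligned.

```python
def crossfade(low, high, start, fade_len):
    """Play `low` until `start`, blend linearly into `high` over
    `fade_len` samples, then play `high` alone."""
    n = min(len(low), len(high))
    out = []
    for i in range(n):
        if i < start:
            gain = 0.0                       # all low-quality
        elif i >= start + fade_len:
            gain = 1.0                       # all high-quality
        else:
            gain = (i - start) / fade_len    # ramp from 0 to 1
        out.append((1.0 - gain) * low[i] + gain * high[i])
    return out


# Toy signals: 0.0 stands for the grainy output, 1.0 for the clean one.
low = [0.0] * 8
high = [1.0] * 8
blended = crossfade(low, high, start=2, fade_len=4)
print(blended)  # [0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
```

A longer fade hides the switch better but keeps the grainy sound around longer, so the fade length is a taste decision.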
Whether I try to do that or not remains to be seen. It would be kind of fun, but I'm still thinking that the best thing to offer is tighter integration with high-quality real-time pitch shifters you can buy. Many people already have these, and if I can make use of them in a seamless way, you can take advantage of that with a lot less work for me. It's more like hosting a plugin with automated control over its parameters. You ask Mobius "up a fourth", Mobius tells the plugin "up a fourth", and replaces the loop output with whatever the plugin is doing.
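To make the "automated control over its parameters" part concrete, here is a rough sketch of turning a request like "up a fourth" into a semitone value and handing it to a plugin. Everything here is hypothetical: `FakeShifterPlugin`, its `set_parameter` method, and the `"semitones"` parameter name are placeholders, not the API of Mobius or of any real plugin. The interval table assumes major and perfect intervals.

```python
# Assumed mapping: major/perfect intervals in semitones.
INTERVAL_SEMITONES = {
    "second": 2,
    "third": 4,
    "fourth": 5,
    "fifth": 7,
    "octave": 12,
}


def parse_shift(request):
    """Turn 'up a fourth' into 5 and 'down a fifth' into -7."""
    words = request.lower().split()
    direction = {"up": 1, "down": -1}[words[0]]
    return direction * INTERVAL_SEMITONES[words[-1]]


class FakeShifterPlugin:
    """Hypothetical stand-in for a hosted pitch shifter plugin."""

    def __init__(self):
        self.semitones = 0

    def set_parameter(self, name, value):
        if name == "semitones":
            self.semitones = value


def request_shift(plugin, request):
    """The host translates the request and forwards it to the plugin."""
    plugin.set_parameter("semitones", parse_shift(request))


plugin = FakeShifterPlugin()
request_shift(plugin, "up a fourth")
print(plugin.semitones)  # 5
```

A real plugin would more likely expose a normalized 0 to 1 parameter that the host has to scale into, but the flow of request, translation, and parameter change would be the same.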