SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 1
Many of you may not know much about SOULNOTE. We would like to briefly introduce SOULNOTE in several parts.
SOULNOTE is a high-end audio brand of CSR Corporation (Kanagawa Prefecture), which was established in 2004 by former Marantz Japan director Norinaga Nakazawa. The company employs about 50 people.
After graduating from the graduate school of Tottori University, I joined NEC in 1989, where I worked in the Audio Engineering Department.
With NEC's withdrawal from the audio business, I was transferred to Nippon Marantz. There, I was in charge of the development of non-feedback power amplifiers for professional use and the PHILIPS LHH series.
In 2005, Nakazawa called me to join CSR, where I was in charge of SOULNOTE design. I was mainly responsible for the design of dc1.0, da1.0, sa1.0, sc1.0, SA710, SC710, etc.
The turning point came when I was appointed as Chief Engineer of SOULNOTE in 2016.
I was entrusted with everything in a single role, from product planning, electrical design, and structural design to sound quality management and promotion. At that time, Nakazawa had only one thing to say to me.
"Make SOULNOTE the best audio brand in the world."
Well, "number one in the world" was Nakazawa's favorite phrase, and while it may seem like a wild idea, I had an idea to make it happen. We would properly explain the reasons for the "discrepancy between specifications and sound quality," which is the biggest mystery in the audio world, achieve unprecedented sound quality using methods that other manufacturers would not (or could not) use, and present it to the world. If we could do that, I really thought we could turn the world's perception of value upside down. I understood the reason for the discrepancy between specs and sound quality, and that is the basis of SOULNOTE's current design philosophy. And now I have the environment to make it happen.
All that remains is to do it.
2021. My position as SOULNOTE Chief Engineer did not change when Kazuyoshi Takanashi was appointed President of CSR. At the end of the year, the S-3R, X-3, and Z-3 were selected as the reference system for the listening room of StereoSound magazine, the most prestigious magazine in Japan, in recognition of SOULNOTE's unprecedented methods and sound quality.
The year is 2022. As a further leap forward, SOULNOTE will take on a full-scale entry into Europe, starting with the Munich High End Show in 2022.
Here is a timeline of the models we have developed since I took over as Chief Engineer.
2016 A-1, C-1, E-1, A-0
2017 D-1, A-2, E-2
2018 D-2, D-1N
2019 S-3
2020 S-3ver2, P-3
2021 ZEUS (D-3, Z-3, X-3, RCC-1), S-3 Ref.
I will discuss my design philosophy in the next section.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 2
I will write about the design philosophy of SOULNOTE.
The specifications (static specs) referred to here are so-called catalog specs such as distortion rate, frequency response, and signal-to-noise ratio. These are easily quantifiable performances, mainly using sine waves for measurement.
In the audio industry, everyone knows that sound quality cannot be judged by specifications (static specs) alone. Also, anyone who likes audio knows that the sound changes depending on cables and racks, even though the specs show no difference no matter how precisely they are measured!
This would seem odd to anyone but an audiophile. In this age when science is considered all-powerful, it seems impossible that people could sense small differences that cannot be detected even by precise measurement with high-end instruments. Human hearing is not that good, and its frequency range is only 20 Hz to 20 kHz at best. (But that is for sine waves!)
Well, that is why, even though we knew that sound quality cannot be described by specs alone, I think there was still a part of us that could not go against the specs. In other words, throughout the history of audio, no one has been able to refute the opinion that "sound is a matter of taste, so you are free to choose, but there is no doubt that a sound with better specifications is a more correct sound."
For example, suppose you are developing a product and, by working on the circuitry, you improve the specs in some way. And suppose the sound changes. In that case, most engineers would assume that the sound with the better specs is the "better sound." Furthermore, if a major manufacturer develops a new product whose specs are inferior to those of its predecessor, the bosses and salespeople will usually not allow its release, no matter how good it sounds. This is especially true if the manufacturer has been explaining sound quality in terms of specs.
Let me tell you an old story.
When I was a student, I loved music, but I had no money, so I built amplifiers and speakers as a hobby. At first I didn't have any proper measuring instruments, but that didn't matter, because all that mattered was that I could enjoy listening to music. For me, it was equipment I was proud of, equipment that let me enjoy music. It sounded a whole lot better than my friend's high-end amplifier.
However, one day I acquired a measuring instrument. When I measured my equipment, the results were terrible. So I wanted to improve the measured values as much as possible. After various improvements had produced better readings, I got a real shock: listening to music through it was completely boring. Why was this? I have been thinking about it for 40 years since then. And eventually I arrived at a way of thinking.
Imagine for a moment.
What if I could explain to you that specs don't mean much for sound quality? Furthermore, what if I could explain that improving the specifications may even degrade the sound quality? Don't you think that would overturn your values?
I can explain this. It is not that difficult. It is all because of a certain curse.
To be continued in the next article.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 3
In today's age, when science is considered all-powerful, everyone thinks that it is impossible for humans to hear differences that cannot be detected even by the most advanced measuring instruments. But is this really so? In fact, there are many values around us that cannot be easily quantified.
Take, for example, cooking. Suppose we measure the mass of each ingredient with a state-of-the-art measuring instrument and make it exactly the same to the nearest 0.0001g. Even so, if the creators were a world-famous chef and myself, it is only natural that the resulting dish would taste different. The reason is that although the ingredients are exactly the same, the cooking skills are different. But can the cooking skill be quantified? And can the taste be quantified? This is quite difficult. Even today, the only way to evaluate the taste of a dish is to try it.
Take automobiles, for example. If two cars with precisely matched engine power and weight were driven on a circuit by the same driver, would they set the same time? That is not possible. Body rigidity and suspension settings can completely change the time. This is because cornering performance changes. However, there is no section on cornering performance in the car catalog. In other words, you cannot know the performance of a car until you drive it. Even in the cutting-edge F1 where everything is electronic and various simulations are possible, at the end of the day, the only way to tune the car is for the driver to actually drive it.
I am sure I will be scolded: "You gave examples of food and cars, but that has nothing to do with audio!" That's right. I am only giving examples to show that values which cannot be expressed in numbers are perfectly ordinary.
Now, sound quality in audio is different from these. This is because not only can't sound quality be measured by specs, but also better specs can make the sound worse.
This will be explained in the next article.
To be continued.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 4
In Part 1, I wrote, "I have an idea to achieve a different level of sound quality with a new methodology never seen before." Now it is time to explain this idea. But first, I need to explain the "curse" that is the reason why no one has been able to reach this idea until now.
In the last issue, I wrote about an example that even today, there are some performances that cannot be expressed in numbers. In the same way, in audio, there may be some factors that cannot be expressed in numbers but change the sound. There may also be factors that change the sound with cables. There may be factors that are not yet generally known or overlooked.
Well, even if there are factors that cannot be expressed in numbers, an engineer at an ordinary audio manufacturer would think as follows: "Just improve the distortion ratio, signal-to-noise ratio, frequency response, and other catalog specs first, and then work on making the sound better!" This has been the conventional wisdom. Catalog spec competition was especially fierce in the past, and even now it remains fierce in the field of digital audio. Everyone thinks that there is no way the sound will get worse by improving the values. That is a trap... improving catalog specs can result in worse sound. And it is not uncommon. In many cases, pursuing catalog specs more than necessary is accompanied by deterioration of sound quality. The reasons for this are described below. It will be a bit long, but please read it. We will reach a conclusion that no one has ever told you before! However, it is not yet a theory that has been proven by formal experiments, and my subjective opinion is involved, especially when it comes to sound quality evaluation. I will be the first to admit that. However, I am confident that the sound quality obtained by this method will resonate with many people.
To begin with, sound is made up of an amplitude axis and a time axis, which are the vertical and horizontal axes of a graph. Music sources in audio are also recorded as amplitude (voltage values) over time. This is basically the same for both digital and analog sources. Without the time axis, sound cannot exist. As proof, there is such a thing as a "still image" in video, but there is no such thing as a "still sound." You have never heard a still sound, have you?
Sine waves are used to measure catalog specs such as distortion rate, frequency response, and signal-to-noise ratio. The reason is that it is convenient for quantification. A sine wave is a signal of a single frequency that lasts forever. It is a static signal with no dynamic changes. I mentioned that there is no static sound, but a sine wave is close to that. This makes the measured result less likely to reflect a temporal component. I mentioned that sound has two axes, the "amplitude axis" and the "time axis," but the catalog spec is a measurement that almost ignores the "time axis" in order to make it easier to quantify.
We often use FFT (Fast Fourier Transform) analyzers to analyze sound. Simply put, the FFT transforms the time axis into the frequency axis for easier analysis. Assuming that a signal of a certain time width is repeated forever, we decompose it into its frequency components and arrange them. This is called the Fourier transform. The familiar frequency response graph is the result of the Fourier transform itself. In this case, too, the time axis is completely ignored.
In other words, the Fourier transform is like turning food into a paste in a blender and then separating and arranging it by component in a centrifuge. The chef's skill is ignored.
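As a small illustration of what a magnitude-only view leaves out, the sketch below builds two short signals that contain exactly the same samples in opposite time order: a "plucked" note that starts sharply and decays, and the same note played backwards. The sample values, the length of 16, and the helper names are hypothetical choices made only for this illustration. Strictly speaking, the Fourier transform keeps the timing information in its phase; it is the magnitude plot, the form in which frequency response is usually read, that discards it.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for very short signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitude of each frequency bin; the phase is thrown away."""
    return [abs(c) for c in dft(x)]


# Hypothetical test signals: a sharp attack that decays, and its time reversal.
pluck = [0.7 ** t for t in range(16)]
swell = pluck[::-1]

mag_pluck = magnitude_spectrum(pluck)
mag_swell = magnitude_spectrum(swell)

same_magnitude = all(abs(a - b) < 1e-9 for a, b in zip(mag_pluck, mag_swell))
same_waveform = pluck == swell

print("same magnitude spectrum:", same_magnitude)
print("same waveform in time:  ", same_waveform)
```

The two signals would sound quite different, one with a sharp attack and one that swells and stops abruptly, yet a magnitude-only graph cannot tell them apart.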
Somehow we have come to think of sound quality in terms of the frequency axis. And somehow we have forgotten about the time axis. I call this Fourier's curse.
When I was a child, I used to think that a perfect graphic equalizer would give me the freedom to create any kind of sound quality I wanted. But of course, even if you match the frequency response, the sound quality will not be the same. We try to find the answer in the signal-to-noise ratio or distortion ratio. But that is the curse of Fourier: we are made to forget about the time axis.
It is as if we were puzzled by the difference in taste between two dishes made with the same ingredients in the same quantities (an exact match on the frequency axis). The cook's skill, for example the order in which the ingredients are added or the simmering time (the time axis), is not even considered. It is truly a curse.
From this point on, frequency axis performance that can be quantified as catalog specs, such as distortion rate, frequency response, and signal-to-noise ratio, is called static performance.
On the other hand, the performance related to the time axis, which is difficult to quantify, is called Dynamic performance.
Dynamic performance is the lost performance that does not appear in ordinary catalog specs. To name just a few, rise time, impulse response waveform, and clock jitter are aspects of Dynamic performance. However, it is difficult to quantify and visualize because it seems to affect the sound only for a very short period of time.
Dynamic performance is like a chef's skill in cooking. In the case of a car, it is the cornering performance. It is interesting to note that the time axis is a factor in these kinds of performance too, and that they are difficult to quantify. Humans seem to be good at quantifying things by ignoring time. The only way to judge the essence is to eat or to drive. Likewise, Dynamic performance in audio can only be understood and judged by listening.
And there is something even more tricky. Static performance and dynamic performance in audio become a trade-off after a certain level. The reason for this lies in the characteristics of human hearing.
Here is an extreme example of how too much pursuit of static characteristics leads to degradation of dynamic characteristics. I am sorry to use the car analogy again.
A car built for races that are decided by straight-line acceleration over 400 m is called a drag car. It is much faster than an F1 car in terms of straight-line acceleration, but it cannot turn. Let's apply this to audio.
The performance required to listen to music is similar to the performance required to drive a car fast on a circuit. In other words, what matters is the ability to trace (reproduce) various circuits (sound sources) faithfully, that is, the dynamic characteristics. A straight line, on the other hand, corresponds to a sine wave in audio. The performance that can be measured with it is therefore exactly the static characteristics. An audio product that places too much emphasis on static characteristics, like a drag car, cannot reproduce music properly.
With audio equipment, it is common to first improve static performance and then tune the sound quality. But is that enough?
This is not the case with cars.
It is impossible to increase cornering speed by tuning after building a car that pursues straight line performance first; the basic design of F1 cannot be considered without cornering performance.
A car that specializes in static performance (straight line performance) cannot drive on a circuit.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 5
Finally, I will write about human hearing. It seems to me that the conventional wisdom about hearing is also distorted by Fourier's curse, which is centered on static performance. When we evaluate sounds, we unconsciously think in terms of the frequency axis: bass, midrange, treble, and so on. I call this cursed way of thinking the "frequency brain."
It is common knowledge that humans cannot hear above 20 kHz. Of course, I cannot hear it either. However, that is the case with sine waves.
Let me put it this way.
"Humans cannot hear a sine wave above 20 kHz, but they can sense the slower rise of a musical waveform when the frequency band above 20 kHz is cut off."
In other words, experiments emphasizing Static performance with the frequency brain and experiments emphasizing Dynamic performance with the time axis taken into account have different results. Let me illustrate this with my favorite sushi.
Let's compare two pieces of sushi, one made by a sushi chef and one by an amateur, using exactly the same ingredients and rice. The frequency-brain experiment goes like this: crush the sushi in a blender and analyze the ingredients in a centrifuge. The conclusion will be that, because there is no difference in the ingredients, the taste is the same, and that who forms the sushi makes no difference to the taste. Of course, I would not be able to taste any difference between pieces of sushi that have been crushed into mush. I wouldn't even want to eat them.
The sine wave experiment is an experiment that does not take the time axis into account, i.e., the sludge sushi experiment. Why not just eat the sushi and compare? Because, they say, that cannot be quantified and is subjective, so the component analysis of the sludge sushi is considered more important. That is the state of audio today, overrun by the frequency brain. No matter how good the sound is, nobody can clearly refute the static-performance-first view that "if it doesn't measure right, it can't sound right!" That is the level we are at. Isn't it ridiculous?
There are any number of phenomena generally recognized in modern audio that contradict the assumption that humans do not perceive anything above 20 kHz. Take, for example, sound image localization. If the equipment is excellent, we can perceive three-dimensional sound image localization with two speakers. If you are one of those who say, "I don't believe that!", I am sorry, but there is no need to read any further. It is true that some people do not perceive it, but it is also true that some people do. If the assumption that "humans cannot hear above 20 kHz, so it is not necessary" were correct, it would be impossible to explain three-dimensional sound image localization at all. This is because the phase difference required to produce finely resolved sound image localization, when converted into frequency, far exceeds 20 kHz.
This is also a fairly well-known phenomenon. It is becoming common knowledge that the difference in sound quality with a 10 MHz clock generator is very large in today's audio. This is exactly what we are talking about when it comes to the time axis. As I have said before, sound is made up of only an amplitude axis and a time axis. The reference for the amplitude axis is GND, and the reference for the time axis is the clock signal. The clock signal controls half of the sound. So it is no surprise that it has a significant effect on sound. However, it has no effect on the results of measurements made with the frequency brain. No matter how much jitter (time fluctuation) there is in the clock signal, as long as the period is correct, the time fluctuation is averaged out and makes no difference.
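The averaging argument can be sketched numerically. The example assumes a 44.1 kHz sampling clock with 1 ns RMS of random, zero-mean timing error and a 10 kHz test tone; all three figures are hypothetical, chosen only to make the effect visible, and the model ignores everything else in a real converter. It compares the long-term average period, which is what an averaged measurement sees, with the error made at each individual sampling instant.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

FS = 44_100.0         # assumed sample rate in Hz
PERIOD = 1.0 / FS
JITTER_RMS = 1e-9     # hypothetical 1 ns RMS timing error per clock edge
F_TONE = 10_000.0     # hypothetical test tone in Hz
N = 10_000

# Ideal sampling instants, and the same instants disturbed by clock jitter.
ideal_times = [i * PERIOD for i in range(N)]
jittered_times = [t + random.gauss(0.0, JITTER_RMS) for t in ideal_times]

# Long-term average period: the timing errors almost completely cancel.
avg_period = (jittered_times[-1] - jittered_times[0]) / (N - 1)
period_error_ppm = abs(avg_period - PERIOD) / PERIOD * 1e6


def tone(t):
    """Value of the test tone at time t."""
    return math.sin(2.0 * math.pi * F_TONE * t)


# Error made at each individual sample by sampling at the wrong instant.
errors = [tone(tj) - tone(ti) for tj, ti in zip(jittered_times, ideal_times)]
rms_error = math.sqrt(sum(e * e for e in errors) / N)

print(f"average period error: {period_error_ppm:.4f} ppm")
print(f"per-sample RMS error: {rms_error:.2e} of full scale")
```

In this model the average period is essentially perfect while each individual sample still lands at a slightly wrong moment, which is the sense in which jitter belongs to the time axis rather than to averaged measurements.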
Thinking about clock generators is a chance to break free from the frequency brain. It is proof that humans can perceive minute behavior at 10 MHz, not just at 20 kHz.
Here is an easy experiment. For example, the analog amplifier stage of the D-2 or S-3 is basically flat, but it has a built-in LPF that attenuates by 8 dB at 100 kHz and can be switched in or bypassed with a switch. The LPF is a simple construction that uses a mechanical relay to connect or disconnect a capacitor, and it has no effect on the audible band below 20 kHz. However, anyone can recognize the difference. We removed the LPF from the S-3 Reference and D-3 because, of course, having no LPF is better in terms of sound quality.
The experiment of fitting a ferrite core, which attenuates at 10 MHz and above, to a line cable or speaker cable is simple. Just snap it into place and listen. If the equipment is excellent, there are few people who do not feel a change in sound, whether for better or worse. This proves that humans can sense changes in signal waveforms at 10 MHz. Whether the cause is the reduction of high-frequency noise or the dulling of the signal waveform, the difference can be felt. I believe that a proper blind experiment would yield a significant difference. However, you need good equipment and good test subjects. It is impossible for someone who has never eaten sushi to do a sushi comparison.
Can humans hear above 20 kHz? The various experiments conducted on this subject in the past are full of holes, such as super-tweeter experiments that ignore waveform synthesis, and experiments with randomly chosen listeners. These are also the work of the frequency brain.
In the next article, I will finally explain how static performance and dynamic performance become a trade-off relationship from a certain level. In other words, why raising Static performance more than necessary degrades Dynamic performance.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 6
If it were proven that "raising Static performance too much degrades Dynamic performance," it would be a discovery so significant that it would turn the audio world upside down. However, this has not been proven theoretically at this time. Of course, I am convinced that with proper experimentation, statistically significant results can be obtained. I would like to do it someday. But unfortunately, I don't have the time right now. I am a designer, not a scholar. SOULNOTE is also a global social experiment to prove the theory.
I have always been aware of the fact that "raising static performance too much degrades dynamic performance," and this is an everyday occurrence for me in product development. This is not something special, but something that anyone who can honestly listen to music can clearly understand if audio equipment is to be used as a machine for enjoying music. The wonderful sound that makes music resonate in our hearts, stirs up various emotions, and sometimes brings tears to our eyes, can easily be drowned out by extra methods used to increase static performance. I have experienced this countless times. I will illustrate this with specific examples below. However, please understand that the evaluation of sound quality and musical expression (evaluation of Dynamic performance) is my subjective opinion. The only way to evaluate sushi is to eat it.
Negative feedback circuits are a common way to improve static performance. 99% of audio circuits in the world use negative feedback. I used to design amplifiers with negative feedback circuits, but the deeper the negative feedback, the better the static performance and the more the music loses its life force and sounds boring. In other words, because the output is perpetually being returned to the input, it is like sushi that has lost its freshness. This seems to be becoming known around the world, and these days there are fewer audio amplifiers with feedback as deep as in the past.
SOULNOTE's analog stage is a non-NFB circuit that has eliminated negative feedback. Naturally, the static performance is worse, but the sound is fresher, the music is more vibrant, and the heart is more resonant. The amount of feedback is an example of a very easy-to-understand trade-off.
Generally, to improve the signal-to-noise ratio, it is common sense to reduce N (noise), because S (signal) is fixed in nature. However, in designing a phono equalizer, I discovered that the measured value and the audible signal-to-noise ratio become the exact opposite from a certain point. Phono equalizers need to amplify minute signals significantly. When amplifying with two stages of transistors, it is better to amplify as much as possible in the first stage to reduce noise, because reducing the gain of the second stage reduces the amount of noise from the first stage transistor that is amplified in the second stage. This is common sense in transistor circuits. But! The opposite is true as far as music is concerned. Lowering the gain of the first stage improves the freshness of the sound, and conversely, it sounds as if the SN has improved. In fact, when the signal-to-noise ratio is actually measured, the numbers get worse. This was really strange, but now I can explain it.
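The conventional reasoning about first-stage gain can be put into numbers with the usual textbook model of two cascaded stages whose noise sources are uncorrelated: the second stage's noise, referred back to the input, is divided by the first stage's gain. The noise densities and the total gain below are hypothetical values for illustration only, not figures from any SOULNOTE product.

```python
import math


def input_referred_noise(e1, e2, gain1):
    """Total noise referred to the input of a two-stage amplifier.

    e1, e2: input noise densities of stage 1 and stage 2 (same units).
    gain1:  voltage gain of stage 1. Uncorrelated sources add in power.
    """
    return math.sqrt(e1 ** 2 + (e2 / gain1) ** 2)


E1 = 1.0             # hypothetical stage-1 noise, nV/sqrt(Hz)
E2 = 5.0             # hypothetical stage-2 noise, nV/sqrt(Hz)
TOTAL_GAIN = 1000.0  # hypothetical overall gain (60 dB), held constant

results = {}
for gain1 in (10.0, 30.0, 100.0):
    gain2 = TOTAL_GAIN / gain1
    results[gain1] = input_referred_noise(E1, E2, gain1)
    print(f"stage 1 x{gain1:5.0f}, stage 2 x{gain2:5.1f}: "
          f"{results[gain1]:.3f} nV/sqrt(Hz) referred to input")
```

With the total gain held constant, the measured noise falls as more of the gain is moved into the first stage. This is the static-performance side of the trade-off; the model says nothing about transient behaviour.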
In other words, by reducing the gain of the first stage, which is the load seen by the cartridge, the Miller effect is reduced and the high-frequency characteristics, or transient response (dynamic performance), are improved. To summarize:
Static performance is kept to the minimum necessary.
Prioritizing dynamic performance makes for more enjoyable music reproduction.
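The Miller effect mentioned above can be sketched with the standard approximation for an inverting gain stage: a feedback capacitance between input and output appears at the input multiplied by (1 + gain), and together with the source resistance it forms a low-pass filter. The 10 pF capacitance and 1 kOhm source resistance are hypothetical round numbers, and a real cartridge is not a pure resistance, so this shows only the direction of the effect.

```python
import math


def miller_input_capacitance(c_feedback, gain):
    """Effective input capacitance of an inverting stage with voltage gain -gain."""
    return c_feedback * (1.0 + gain)


def input_corner_frequency(r_source, c_input):
    """-3 dB frequency of the low-pass formed by r_source and c_input."""
    return 1.0 / (2.0 * math.pi * r_source * c_input)


C_FEEDBACK = 10e-12   # hypothetical 10 pF between input and output
R_SOURCE = 1_000.0    # hypothetical 1 kOhm source resistance

for gain in (100.0, 10.0):
    c_in = miller_input_capacitance(C_FEEDBACK, gain)
    f_c = input_corner_frequency(R_SOURCE, c_in)
    print(f"gain x{gain:5.0f}: input capacitance {c_in * 1e12:6.0f} pF, "
          f"corner frequency {f_c / 1e3:7.1f} kHz")
```

Under these assumptions, lowering the first-stage gain from 100 to 10 raises the input corner frequency by roughly a factor of nine.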
E-1 and E-2 are phono equalizers designed in this way. If you turn the amplifier volume up to maximum without lowering the cartridge onto the record, you will hear more noise than with phono equalizers from other manufacturers. However, when you actually play the record, you will not be bothered by the noise; rather, the signal-to-noise ratio will sound better than with phono equalizers from other manufacturers, and the music will resonate in your heart. That is, sound with good transient response (excellent dynamic performance) reaches the human ear more clearly. Even if the N of the signal-to-noise ratio increases, the S increases (presence increases) even more.
This is a long story and will be continued in the next article.
Next time, we will finally talk about NOS. It will truly be a battle between Static performance and Dynamic performance.
And why is the human ear more sensitive to Dynamic performance? I will also write about my hypothesis. Stay tuned!
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 7
Previously, I have described an example of how we can predict that too high a static performance will degrade dynamic performance. This time, I would like to go further and write about an example where dynamic performance is more important to the human ear than static performance.
Static performance is performance that can be quantified and catalogued.
Dynamic performance is a time-related performance that cannot be easily quantified and can only be judged by listening.
Among SOULNOTE's accomplishments over the past six years, I think the most epoch-making event was the adoption of the NOS (Non Over-Sampling) mode. Moreover, it was adopted not as a special mode, but as the default mode. The sound quality has been highly evaluated, and the product is selling well in Japan.
Incidentally, the SOULNOTE D-2 has won first place in the DA converter category of Japan's most prestigious StereoSound magazine awards for four consecutive years. The S-3 Reference SACD player was selected as the reference device in StereoSound's listening room. Incidentally, SOULNOTE digital equipment can be switched between NOS mode and FIR (8x oversampling digital filter) mode with a key on the main unit or remote control. We have made it possible to switch between the two so that anyone can do a comparison experiment. (In case of USB input, some models are fixed to NOS mode.)
Now, when NOS mode is selected, Static performance is very poor.
The distortion (THD+N) value is especially miserable. Without bandwidth limitation, the distortion at 1 kHz is about 2%. When FIR mode is selected, on the other hand, it is about 0.005%. Even that figure is only as high as it is because the analog stage is a discrete non-feedback amplifier, but still, that is a 400-fold difference in distortion! This result is not surprising, because in NOS mode the waveform is staircase-like and there is no LPF in the analog stage. The staircase waveform, when Fourier transformed, contains "image signals" at 20 kHz and above, and if analyzed with an FFT analyzer, the signal-to-noise ratio will also appear very poor. (However, I believe that the image signal is a signal to ensure temporal accuracy and not just noise. It is the frequency brain that makes it look like noise.)
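To see where the image signals of a staircase waveform fall, the sketch below uses the textbook model of an ideal zero-order hold, in which every sample is held for one full sample period. For a tone at f0 sampled at fs, that model places images at k*fs - f0 and k*fs + f0, each with a level of f0 divided by the image frequency relative to the fundamental. The 44.1 kHz sample rate is an assumption (the text does not state the rate used for the measurement), and the result is not meant to reproduce the quoted THD+N figures.

```python
def zoh_image_levels(f0, fs, max_order=3):
    """List (frequency, level relative to the fundamental) for each image.

    Assumes an ideal zero-order hold: an image at k*fs +/- f0 has a
    relative level of f0 / (k*fs +/- f0).
    """
    images = []
    for k in range(1, max_order + 1):
        for f_img in (k * fs - f0, k * fs + f0):
            images.append((f_img, f0 / f_img))
    return images


F0 = 1_000.0    # 1 kHz test tone, as in the text
FS = 44_100.0   # assumed sample rate (not stated in the text)

images = zoh_image_levels(F0, FS)
for f_img, level in images:
    print(f"{f_img / 1000:7.1f} kHz  {level * 100:5.2f} % of the fundamental")

# Ratio between the two THD+N figures quoted in the text (2 % and 0.005 %).
ratio = 2.0 / 0.005
print(f"ratio of the quoted distortion figures: {ratio:.0f}x")
```

Every image in this model lies above 20 kHz, so a measurement without bandwidth limitation counts them as distortion and noise even though none of them falls inside the conventional audible band.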
Many SOULNOTE users choose the NOS mode. Why? Because they honestly feel that the sound is good. Freshness, sound image localization, and most of all, musical enjoyment! The majority rate the NOS mode as superior in every respect. In other words, NOS mode lets anyone easily experience, as a change in sound, that Dynamic performance may be more important than Static performance.
Let me explain in more detail.
When observing the output waveform, in the case of FIR mode, it looks beautiful if it is a sine wave. However, with an impulse waveform, an echo is observed. This echo is an artificial waveform created by the digital filter algorithm and helps to make the staircase waveform look smooth. In other words, FIR mode is a mode that specializes in Static performance. In exchange, time-axis precision is lost. This is truly the curse of Fourier.
On the other hand, in NOS mode, the sine wave looks jagged and dirty, but the impulse waveform is very beautiful. In other words, it is a mode that specializes in dynamic performance and is faithful to the time axis. It does nothing clever. It simply lines up the sampled data as they are. Yet many people find this sound good.
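The difference between the two impulse responses can be sketched with a generic 8x interpolation filter. The filter below is a plain Hann-windowed sinc, an assumption chosen because it is a common textbook design; it is not SOULNOTE's actual FIR filter, and the span of four input samples per side is arbitrary. The response to a single input sample is simply the list of filter taps, so the taps can be inspected directly.

```python
import math


def windowed_sinc_taps(factor=8, half_width=4):
    """Interpolation low-pass taps: a sinc shaped by a Hann window.

    factor:     oversampling ratio.
    half_width: number of input samples spanned on each side of the centre.
    Both values are illustrative choices.
    """
    n_half = factor * half_width
    taps = []
    for i in range(-n_half, n_half + 1):
        x = i / factor
        sinc = 1.0 if i == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * i / n_half))
        taps.append(sinc * window)
    return taps


def hold_taps(factor=8):
    """NOS-style response: the sample is simply held for one sample period."""
    return [1.0] * factor


fir = windowed_sinc_taps()
nos = hold_taps()

centre = len(fir) // 2
pre_echo = [t for t in fir[:centre] if abs(t) > 1e-3]

print(f"FIR: {len(fir)} taps, {len(pre_echo)} noticeable taps before the peak, "
      f"most negative tap {min(fir):.3f}")
print(f"NOS: {len(nos)} taps, all equal to {nos[0]}")
```

The filtered response starts before the main peak and swings negative, which corresponds to the echo described above, while the held response is a single flat pulse with nothing before or after it.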
In other words, Dynamic performance sounds more natural to the human ear and resonates more with the human mind than Static performance. Why? I will write about this in the next issue.
[Figure: Impulse response waveform of D-2. Upper: FIR mode. Lower: NOS mode.]
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 8
I believe that hearing is more sensitive to dynamic performance than to static performance because this sensitivity is an important survival function that has been imprinted in human DNA since primitive times. I would like to present some examples that suggest this. This is only my hypothesis.
It is said that the eyesight of a hawk in the sky is dozens of times better for moving objects than for stationary ones. And not only hawks: we too can easily spot moving objects, can't we? That is why we wave a hand in a crowd to make ourselves easier to find, and even Wally would be easy to find if he were moving. Things that move are easy to find. Isn't it obvious? It is no wonder that this applies to hearing as well. "Crack!" Sensitivity to such impulsive sounds (dynamic sounds), including the ability to tell which direction they come from, must have been a necessary function for obtaining food since ancient times.
In primitive times, people were attacked by enemies mainly at night. This was because they could not see well at night. It is believed that hearing was very important to protect oneself from enemies at night. It can be imagined that those who sensed the presence of an enemy's footsteps and direction and fled were the ones who survived. In other words, it is natural that the human ear is sensitive to impulsive (dynamic) sounds. And it is that fast-rising sound that is the dangerous sound that we should listen to intensively. Static sounds, on the other hand, are less dangerous and do not require as much sensitivity.
A sine wave is exactly such a static sound. A continuous sound like a sine wave signals no danger, so there is no need for heightened sensitivity.
Consider the case of cutting the band above 20 kHz on the grounds that sine waves above it cannot be heard. This is a common technique used in audio products to improve the signal-to-noise ratio. In this case, the static performance value is better because the noise component is reduced. However, the rise of an impulsive sound becomes slower. In other words, dynamic performance is reduced. It is natural for humans to perceive such sounds as insignificant and unrealistic, because we do not immediately perceive them as dangerous.
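The link between bandwidth and rise time can be made concrete with the simplest possible filter. For a first-order low-pass with cutoff frequency fc, the step response is 1 - exp(-t / tau) with tau = 1 / (2 * pi * fc), so the 10% to 90% rise time is tau * ln(9), or roughly 0.35 / fc. The first-order model and the three cutoff frequencies are assumptions for illustration; real band-limiting filters are usually steeper and behave differently in detail.

```python
import math


def rise_time_10_90(cutoff_hz):
    """10%-90% rise time of a first-order low-pass step response, in seconds."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)
    return tau * math.log(9.0)


# Hypothetical cutoff frequencies for comparison.
for cutoff in (20_000.0, 100_000.0, 1_000_000.0):
    t_rise = rise_time_10_90(cutoff)
    print(f"cutoff {cutoff / 1000:7.0f} kHz -> rise time {t_rise * 1e6:6.2f} us")
```

In this model a 20 kHz cutoff stretches an ideal step into a rise of about 17.5 microseconds, and every fivefold increase in bandwidth shortens the rise by the same factor.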
If human hearing is sensitive to dynamic sound, then static performance-oriented audio equipment (LPFs) that easily cut high frequencies outside the audible bandwidth is a major problem. Why don't people realize this? Even engineers must feel a loss of reality when listening to sound. However, they are misled by the numbers and think in their minds that the less noise, the better the sound. Or perhaps they are not listening to the music when designing. Either way, engineers assume that as long as the numbers are good, the S/N ratio is good, even if the sound is boring and without reality. This is also the curse of Fourier.
More serious are digital operations such as oversampling digital filters and digital PLLs. Digital arithmetic is not mathematically wrong, and many would consider the evolution of digital arithmetic to be the evolution of digital audio over the past 40 years. I thought so too until a few years ago. Indeed, it is mathematically correct. However, the premise behind it is dubious: humans cannot hear above 20 kHz, so it does not matter what we do up there. That is the premise of digital audio, and it is far too reckless.
To me, it seems to fatally undermine reality even more than the LPF. I am sure you will understand. But engineers are so blinded by the excellence of static performance that they don't notice it. This is also Fourier's curse.
As I wrote last time, SOULNOTE's digital products allow you to switch between NOS mode and FIR mode for comparison. You will hear how digital manipulation can kill the sound. Even if you cannot tell the difference, that in itself shows that bypassing the digital filter changes the sound less than you might expect, even though the distortion can be as much as 400 times greater!
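To make the NOS/FIR comparison concrete, here is a small sketch of what each mode does to a single-sample impulse on the time axis. It is an illustration under stated assumptions, not SOULNOTE's implementation: the FIR mode is modeled as 8x zero-stuffing followed by a hypothetical Hann-windowed sinc filter, and NOS is modeled as a plain sample-and-hold. The function names and tap counts are made up for this example.

```python
import math

def zero_order_hold(samples, factor):
    """NOS-style output: each input sample is simply held for `factor` steps."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out

def windowed_sinc(factor, half_width):
    """Hypothetical linear-phase interpolation filter (Hann-windowed sinc)."""
    centre = half_width * factor
    taps = []
    for i in range(2 * centre + 1):
        t = (i - centre) / factor  # time in units of input samples
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.5 * (1 + math.cos(math.pi * (i - centre) / centre))
        taps.append(sinc * window)
    return taps

def fir_oversample(samples, factor, half_width):
    """Zero-stuff by `factor`, then filter (filter delay compensated)."""
    taps = windowed_sinc(factor, half_width)
    centre = half_width * factor
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (factor - 1))
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = n + centre - k
            if 0 <= j < len(stuffed):
                acc += h * stuffed[j]
        out.append(acc)
    return out

def energy_before(signal, index):
    """Sum of squared output values that occur before `index`."""
    return sum(v * v for v in signal[:index])

impulse = [0.0] * 8 + [1.0] + [0.0] * 8   # one impulse at input sample 8
factor = 8
start = 8 * factor                        # output index where the impulse begins
nos = zero_order_hold(impulse, factor)
fir = fir_oversample(impulse, factor, half_width=4)
print("energy before the impulse, NOS:", energy_before(nos, start))
print("energy before the impulse, FIR:", energy_before(fir, start))
```

In this model the held (NOS) output contains nothing before the impulse arrives, while the linear-phase FIR output already has non-zero values ahead of it. That spreading in time is a general property of symmetric FIR interpolation filters, whatever the particular coefficients.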
Digital sounds great without digital manipulation. It is not digital that is bad. It is the digital operations that ignore the characteristics of human hearing that are bad.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 9
In previous installments, we discussed static performance, which can be expressed by measurements, and dynamic performance, which relates to the time axis and is difficult to express by measurements, using electrical circuit examples such as non-NFB, NOS, and LPF. This time, we take up a question that has long plagued audiophiles: "Why does the structure of the enclosure affect the sound?" This, too, is a change in sound that does not show up in measurements, and it is truly a matter of dynamic performance. It is a factor that can only be judged by listening.
SOULNOTE products have mechanical features such as an unfixed top panel, unfixed circuit boards, unfixed terminal bases, and thin, light cables. This is the exact opposite of the heavy, rigid construction that is common in high-end products. Why is this so? In this article, we will discuss why mechanical structure affects sound and the secrets of SOULNOTE's enclosure. This is a new idea that perhaps no one else has mentioned, but of course it is only my hypothesis. It is SOULNOTE and you who will verify it.
We feel that the enclosure has a very strong influence on sound. For example, during development it is common to work with the top panel left open for the sake of efficiency. However, it is an everyday occurrence that a wonderful sound, painstakingly refined in this state, is ruined the instant the top panel is closed. The sense of openness disappears, the sound field that had spread out three-dimensionally becomes narrower, and the performance becomes cramped. On top of that, the sound becomes hard and tiring to listen to. Any engineer who designs while listening to the sound should have experienced this.
I believe the reason is the same as the main reason cables and electrical components change the sound: vibration. To be more precise, I believe that the frequency characteristics of the vibration of each component are a factor in the sound.
As it has become clear that vibration has a negative impact on sound, various measures have been taken to prevent it. For example, anti-vibration rubber is attached to cables and capacitors, or weights are placed on them. As a result, the sound changes. And then people think, "The sound is better because of the vibration countermeasures!" Isn't this similar to the earlier story? It is the same as the engineer who thinks, "The LPF reduced the noise, so the sound is better!"
I have rarely felt the sound of rubber or weighted vibration isolation to be good. I feel that the sound atrophies, the echoes disappear, and the sound is often boring and dead.
Of course, vibration is harmful, so it might be good if it could be eliminated completely. However, rubber and weights cannot do that. They isolate fast vibration strongly and slow vibration weakly. In other words, the materials used for vibration isolation have their own frequency characteristics, and these affect the sound. This is why rubber makes a rubbery sound, and why heavy anti-vibration parts make the sound heavier. In fact, it is almost impossible to suppress vibration by weight alone. Even buildings vibrate. Furthermore, the element of resonance, which will be discussed later, also becomes stronger.
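The frequency dependence of vibration isolation can be illustrated with the textbook transmissibility formula for a single mass on a spring with a damper. This is only a sketch under that simple model, not a measurement of any real rubber mount or weight; the 20 Hz natural frequency and the damping ratio of 0.1 are hypothetical values.

```python
import math

def transmissibility(freq_hz, natural_hz, damping_ratio):
    """Ratio of transmitted to applied vibration for a mass-spring-damper mount."""
    r = freq_hz / natural_hz
    numerator = 1 + (2 * damping_ratio * r) ** 2
    denominator = (1 - r ** 2) ** 2 + (2 * damping_ratio * r) ** 2
    return math.sqrt(numerator / denominator)

NATURAL_HZ = 20.0      # hypothetical natural frequency of the mount
DAMPING_RATIO = 0.1    # hypothetical damping ratio

for f in (5, 20, 100, 1000, 10000):
    t = transmissibility(f, NATURAL_HZ, DAMPING_RATIO)
    print(f"{f:>6} Hz: transmitted / applied = {t:.4f}")
```

In this model, vibration well above the natural frequency is attenuated more and more strongly as frequency rises, slow vibration passes almost unchanged, and vibration near the natural frequency is actually amplified. The isolator therefore has a frequency characteristic of its own, including a resonance.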
In other words, rather than suppressing vibration clumsily, it is better to keep things light and free so that no strange coloration is added. It is better to leave a part free so that, if it moves, it can move quickly and without coloration. Lightness also has the advantage of faster settling. The fact that SOULNOTE's cables are thin and light is the result of a choice made by ear, and it is consistent with my hypothesis about vibration.
In the next article, I will discuss another important element of vibration: resonance. I will also discuss my discovery of the "invisible anti-vibration rubber" that kills sound. Stay tuned!
M-3 heat sink
The ends are lightweight and not mechanically constrained.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 10
In the previous issue, I wrote about the possibility that vibration isolation with rubber or weights adds coloration to the sound. For simplicity, I described this as the frequency response of the isolation material, but to be precise, it is a delay on the time axis. I think it is reasonable to suppose that overlapping delayed signals blur the rise of a sound and reduce the sensitivity of human hearing to it. Many people say, "Thick, heavy cables enhance the bass!" That is only because the rise of the sound is blurred, so the low frequencies become relatively more noticeable. Isn't it far more unscientific to think that cables amplify sound?
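The arithmetic behind "overlapping delayed signals blur the rise" can be shown with a toy example. This sketch only illustrates the summation itself; it does not model any real cable or chassis, and the delays and gains are hypothetical numbers chosen for clarity.

```python
def add_delayed_copies(signal, copies):
    """Sum a signal with delayed, scaled copies of itself.

    `copies` is a list of (delay_in_samples, gain) pairs.
    """
    out = list(signal)
    for delay, gain in copies:
        for n in range(delay, len(signal)):
            out[n] += gain * signal[n - delay]
    return out

def rise_samples(signal):
    """Samples between first reaching 10% and first reaching 90% of the final value."""
    final = signal[-1]
    first_10 = next(n for n, v in enumerate(signal) if v >= 0.1 * final)
    first_90 = next(n for n, v in enumerate(signal) if v >= 0.9 * final)
    return first_90 - first_10

step = [0.0] * 4 + [1.0] * 28                               # an instantaneous rise
blurred = add_delayed_copies(step, [(5, 0.3), (12, 0.2)])   # hypothetical delays and gains

print("rise of clean step (samples):  ", rise_samples(step))
print("rise with delayed copies added:", rise_samples(blurred))
```

The clean step reaches its final value at once, while the sum with delayed copies only settles after the last copy has arrived, so its rise is stretched out over the longest delay.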
SOULNOTE's RSC series speaker cables are single wires with a foamed Teflon coating. They are very thin, lightweight, and hard to damp. The effect is even greater when they are floated in the air on cable insulators that touch the ground at a single point.
And then there is the invisible anti-vibration rubber: it is air. When sealed, air is a hard and viscous substance, like rubber. Air suspension supports cars and trucks. In other words, confined air is very hard. It just does not feel that way because we normally live in unconfined, free air.
Now, in an earlier installment, I gave an example in which the sound is damped, loses its sense of openness, and atrophies as soon as the top cover is fastened. This does not seem to be an electrical shielding effect, since the tendency is the same even if the top cover is made of wood, for example. What we are forgetting is the presence of trapped air. When air is trapped, it becomes as hard as rubber and damps the board and all the components on it. Lots of small holes will not release the air effectively, because air is strongly viscous. And air, like rubber, resists fast movement more forcefully.
The top cover of SOULNOTE products is not fixed; it rests on the body on three spikes. Of course, it is hooked so that it cannot come off. If it came off easily, we could not sell it as a product. The result is that the air inside is less tightly confined. I thought about how I could deliver to you the sound heard with the top cover open. After much thought and trial and error, I finally arrived at that top cover that rattles when you push on it. Of course, it would be better if the top cover were somewhat lighter, but that would cause another problem: resonance. We solved this by joining two types of board at three points to form a composite top panel, while at the same time succeeding in releasing the air inside. You can confirm the effect by placing a weight on the top cover. The sound instantly loses its openness and becomes the boring, ordinary sound of high-end audio.
These days, audio racks enclosed by boards like a bookcase are becoming less common, and racks with four pillars supporting the shelves are the norm. I think this is because they sound better. In fact, the sound is good. There are also racks with holes in the shelves. The manufacturers explain that the holes adjust the vibration modes of the board, but I think a different effect is at work.
SOULNOTE's RAR series audio racks also have holes in the boards. The reason is to escape the effects of air damping. When audio equipment is enclosed by boards, it instantly sounds cramped. On the other hand, if the audio rack is open, the sound is liberated and the sound field expands. SOULNOTE products, which release the air inside the product, require an open audio rack.
In the next article, I will write about resonance, a mechanical element that kills sound along with damping.
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 11
In this issue, we will discuss "resonance" of the chassis, which, along with damping, has a negative impact on sound.
Every object has its own natural vibration. Therefore, if you strike an object, it makes its own characteristic sound. This is inevitable, but if the natural vibration is sharp, that is, if the resonance is strong (a high Q value), a strong, long-lasting ringing sound like "kaan" or "keen" is produced. This colors the reproduced sound. Therefore, strong resonance should be avoided.
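The link between the Q value and how long a struck object keeps ringing can be estimated with the standard relation for a lightly damped resonator, whose free-decay envelope falls as exp(-pi * f0 * t / Q). The sketch below assumes that simple model; the 1 kHz resonance and the Q values are hypothetical, not measurements of any chassis or component.

```python
import math

def ring_down_time(f0_hz, q, drop_db=60.0):
    """Time (s) for the free-decay envelope exp(-pi*f0*t/Q) to fall by drop_db."""
    return q * math.log(10 ** (drop_db / 20)) / (math.pi * f0_hz)

def cycles_to_decay(q, drop_db=60.0):
    """Number of oscillation cycles completed during that decay."""
    return q * math.log(10 ** (drop_db / 20)) / math.pi

F0_HZ = 1000.0  # hypothetical resonance frequency
for q in (2, 10, 50, 200):
    print(f"Q = {q:>3}: rings for about {ring_down_time(F0_HZ, q) * 1e3:7.2f} ms "
          f"({cycles_to_decay(q):6.1f} cycles) before falling by 60 dB")
```

In this model the ringing time is directly proportional to Q, so a structure with ten times the Q rings ten times as long after being excited.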
As with the chassis, the sound of an electrical component can be predicted from the sound it makes when struck. For example, a film capacitor that makes a sharp, piercing sound when tapped also has a harsh sound quality. We feel that physical characteristics have a stronger influence on sound quality than electrical characteristics.
To suppress strong resonance, damping materials such as rubber are generally used. This will certainly suppress resonance, but at the same time, the sound will be damped and ruined. In other words, damping degrades the sound more than resonance. This will be discussed in more detail later.
Now, there is a simple way to reduce the intensity of resonance while avoiding damping. Loosen the structure. By doing so, the overall strength of the chassis is reduced, and the strength of the resonance is reduced. Loosening the structure also has the advantage of not transmitting the resonance of one member to another. It is easy to understand that the strength of resonance is reduced by imagining a guitar with relaxed strings.
In the previous issue, I mentioned that as soon as the top cover is secured tightly with screws, the sound is ruined, and we discussed air damping as the cause. Another possible cause is that the resonance of the entire chassis becomes stronger. This is because fixing the top cover creates a monocoque structure and increases the strength of the entire chassis. If tightening the top cover not only makes the sound less open but also makes it harder, this is the cause.
SOULNOTE's unfixed top cover is an idea to prevent air damping, to suppress the strength of the resonance of the entire chassis, and to prevent the resonance of the top cover from propagating to the chassis.
Also, an all-aluminum chassis looks gorgeous but is prone to strong resonance. SOULNOTE's chassis is made of an optimal combination of aluminum and steel plates. The strength of the joints is also kept to a minimum to control resonance.
Even so, the chassis still vibrates. It receives sound pressure and vibration from the power transformer. It is very important not to transmit this to the printed circuit board. This is because most of the electrical components are mounted on the printed circuit board and vibrate together with the printed circuit board.
However, the board should not be floated on rubber or the like, because it would then be affected by damping.
In SOULNOTE, the printed circuit board is supported at three points and not fixed. It is not fixed, but only placed on three pillars without stress, in order not to transmit the vibration of the chassis to the PCB and to avoid the strong resonance of the PCB itself.
Also, the terminals are not fixed, in order to isolate the vibration of the connecting cables. This also reduces the damping of the chassis caused by the weight of the connecting cables.
Resonance and damping are both problems that should be avoided, but they affect the sound differently. Resonance is primarily a frequency-axis problem, while damping is a time-axis problem. In other words, resonance creates peaks at certain frequencies but has little effect on speed on the time axis. Damping, on the other hand, is a delay on the time axis that blurs the sound. I think the emphasis in audio up to now has been on suppressing resonance. However, I consider damping to be at least as problematic, because I believe that human hearing is highly sensitive to the time axis. This is why SOULNOTE's design philosophy is completely different from that of other companies.
In my next entry, I will write about the criteria for sound quality in product development. This is the most important story that relates to everything I have said so far. When I develop SOULNOTE products, I listen and judge everything. However, I do not create any sound. I will write about the meaning of this.
Non-fixed top cover with hybrid construction
SOULNOTE Chief Designer Kato's
Design Philosophy Series Part 12
In this issue, we talk about the most important aspect of SOULNOTE's design. This is the final installment of the series.
Previously, I explained that there is a trade-off between static performance, which can be expressed by measurements, and dynamic performance, which relates to the time axis and is difficult to express by measurements. I also explained why the structure of the chassis affects the sound. This is another element of dynamic performance that can only be judged by listening.
So, to what extent should Static performance be improved? My answer to this is as follows.
As long as there are no problems when listening to music, it is OK.
The most extreme example is the residual noise of a phono equalizer. If the residual noise is lower than the scratch noise of a cartridge tracing a vinyl record, we judge that there is no problem, and everything else is done in pursuit of dynamic performance. In other words, in the end, we just listen and judge.
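The reasoning behind this criterion can be checked with a little arithmetic: uncorrelated noise sources add as powers, so a noise floor that sits well below the record's own surface noise barely raises the total. The sketch below assumes uncorrelated noise and uses hypothetical margins; it does not describe any actual SOULNOTE measurement.

```python
import math

def combined_level_db(level_a_db, level_b_db):
    """Power sum, in dB, of two uncorrelated noise sources given in dB."""
    return 10 * math.log10(10 ** (level_a_db / 10) + 10 ** (level_b_db / 10))

SURFACE_NOISE_DB = 0.0  # record surface noise taken as the 0 dB reference (assumption)

for margin_db in (0, 6, 10, 20):
    total = combined_level_db(SURFACE_NOISE_DB, SURFACE_NOISE_DB - margin_db)
    print(f"equalizer noise {margin_db:>2} dB below surface noise: "
          f"total rises by {total:.2f} dB")
```

With the equalizer's residual noise 10 dB below the surface noise, the total rises by less than half a decibel, so pushing the residual noise lower still changes the total very little.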
We do take measurements at the end. This is to detect manufacturing errors in mass production at the factory. During design, however, I deliberately do not measure. I admit that I feel the urge to improve the catalog specifications as much as possible for the sake of sales. That is exactly why I deliberately avoid measuring, so that my judgment is not biased.
I, too, once thought that an oversampling digital filter was absolutely necessary. Five years ago, during the development of a DA converter, I was experimenting with different settings when the sound suddenly became much better. It was only later that I looked at the waveform and discovered that it was stair-stepped. I had simply chosen the wrong setting, and oversampling was turned off. If I had seen the waveform first, I would have corrected it immediately and would never have heard the NOS sound. And SOULNOTE would never have done NOS. NOS was born thanks to listening to the sound without measuring.
The sound source may be a vinyl record, a CD, or a file. SOULNOTE respects each one as a work of art to the fullest, and we aim to bring out the best in each and every one of them without altering it.
In audio equipment development, I am often asked, "How does SOULNOTE create its sound?" I answer, "I don't create sound!" This is because we believe that audio equipment should not create sound. The reason is that all sound sources are inherently wonderful. But only if you can extract all the information recorded in the sound source!
Conventional designs that emphasize static performance result, if left as they are, in a boring sound with no sense of freshness. My experience leaves no doubt about this. As I wrote before, I have been convinced of it since I was a student. So it is usually necessary to improve that boring sound afterwards by changing components and so on. This is the true nature of sound construction. However, if you prioritize dynamic performance and design while listening to music, there is no need for sound construction. All you need to do is carefully remove bottlenecks. Then the balance settles by itself at a high level, and a wonderful sound is achieved. Imagine a river that has been dammed up in several places. If you keep removing the weirs one by one, the river will finally open up to its full potential, and its original flow will be restored. This is the only work that SOULNOTE does.
By focusing on static performance, the time axis is ignored, and the sound that has lost its freshness cannot be regained, no matter how much you work on the sound later. You can't get back the time that you lost. There is no seasoning that can freshen up sushi that has lost its freshness.
In a culinary analogy, the sound source is the food, and the audio equipment should be the tableware used to enjoy it. The dishes should not have holes in them or be coated with sugar. I think of sound construction as putting sugar or sauce on the dishes. Such tableware is picky about which music and which speakers it suits.
In the process of product development, I always listen to music while selecting components, circuits, and structures. I have said this many times before. Finally, I would like to explain my approach to sound quality in detail.
I can evaluate sound quality with any speaker, as long as it is above a certain level. Also, as long as the sound source is straightforward, I can use any sound source from any era and any genre. However, sound sources whose time axis has been destroyed by digital processing are excluded.
You may wonder, "You don't know the original sound of the sound source, so how can you evaluate it?" To that question, I answer: "No, even the engineer who finished the sound source may not know its true sound, because SOULNOTE was not used for playback."
The reason I do not need to choose particular sound sources or speakers for sound quality evaluation is that I do not take a balancing approach. Balancing, the usual method, means achieving a good sound by matching the equipment's sound to the speakers and the sound source.
I do not take a balancing approach. I only remove obstacles that rob the sound of its freshness or add coloration to it. So the evaluation is simply "Is it there or not?" For example, a three-dimensional echo, a sensation that cuts through space and jumps into your heart, a pleasant feeling that makes you want to keep listening forever. I judge whether or not these things are present. You can tell, for example, just from the feeling of the air right before the music starts. This kind of feeling cannot be created afterwards by audio equipment. It is definitely present in the original sound source. That is the "soul in the sound source" that has never been extracted before.
This is not a skill that is unique to me. Anyone who is present can judge it. If we are faithful to the time axis, the difference in sound is obvious to everyone's ears. That is why my development time is very short. Because it is really easy.
All music is a work of art, a legacy of humanity. And even the souls of musicians who are no longer with us are indeed recorded in their masters. If there is no audio device to revive the soul, it will be lost forever. That must be avoided at all costs. To do so, we need to free ourselves from the curse of Fourier.
I am developing a device to revive the soul. And I feel happy to be in an environment where I can do that. But I still have more to do. Of course, I have not forgotten my promise to Mr. Nakazawa.