Can Big Data and Deep Learning Help Cybersecurity?

 

Combining big data with deep learning may be the best solution for cybersecurity. ...

1
Can Big Data Secure the Internet?


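Original: Big data will fix internet security eventually

Author: Roger A. Grimes
Introduction

Internet security has improved in recent years, for example through two-factor protection of user information and tighter access controls, and laws on cybersecurity are being drafted and enforced. But people may have overlooked that big data could be the ultimate solution to the problem.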

I’ve always thought that improved computer security controls would “fix” the internet and stop persistent criminality - turns out it might be big data analytics instead.

I’ve long written that only a large-scale improvement of the internet’s authentication mechanisms (that is, pervasive identity) could significantly reduce crime. If everyone on the internet had a default, assured identity, attackers would have a much harder time committing and getting away with cybercrimes.

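In fact, big data's role in securing the internet has been visible for some time. Today's antispam mechanisms are quite mature: although 50 percent of all email sent is spam, very little of it reaches users' inboxes. Five to 10 years ago, most of what arrived in an inbox was spam.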

We’ve seen some progress over the years, such as two-factor authentication and better access controls. The days are numbered for simple logon names and passwords. And though it takes time for defensive controls, warrants, and legal evidence to be collected, efforts on the part of law enforcement are resulting in a greater number of successful prosecutions.

Nothing could be further from the truth. As the internet matures, legitimate uses will prevail and criminality will shrink. You can bet the bank - or your bitcoins - on that. What I failed to anticipate in the past, however, is the huge role big data analytics would play in securing the internet, our corporate networks, and our personal devices. Big data security analytics might actually account for a bigger piece of the solution than stronger authentication.

The truth is, we’ve had big data security analytics for a while. For example, today’s antispam mechanisms work pretty well. Spam may still account for more than 50 percent of every email sent across the Internet, but very little of it reaches your inbox. Five to 10 years ago, most of what you saw in your inbox was spam.

Then vendors created not only better local email filters, but also began recognizing email patterns early to prevent spam from being delivered. An antispam solution might see the same email sent to hundreds of people or the same IP address issuing dozens of different emails very rapidly, triggering a filter.
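
The two patterns described above (one message body reaching hundreds of recipients, one IP address sending dozens of emails in a short burst) can be sketched in a few lines of Python. This is only an illustration, not any vendor's actual filter: the function name `flag_bulk_senders`, the tuple layout, and every threshold are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def flag_bulk_senders(messages, max_recipients=100, max_per_ip=24, window_seconds=60):
    """Return (bulk_body_hashes, fast_ips).

    messages: iterable of (timestamp_seconds, sender_ip, recipient, body).
    All thresholds are illustrative assumptions, not values from the article.
    """
    recipients_by_body = defaultdict(set)
    times_by_ip = defaultdict(list)
    for ts, ip, recipient, body in messages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        recipients_by_body[digest].add(recipient)
        times_by_ip[ip].append(ts)

    # Pattern 1: the same body delivered to many distinct recipients.
    bulk = {d for d, rcpts in recipients_by_body.items() if len(rcpts) > max_recipients}

    # Pattern 2: one IP sending many emails inside a sliding time window.
    fast_ips = set()
    for ip, times in times_by_ip.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 > max_per_ip:
                fast_ips.add(ip)
                break
    return bulk, fast_ips
```

An exact hash only catches identical bodies, so a spammer who makes each message unique would slip past the first check; a production system would presumably rely on fuzzier similarity measures.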

Spammers responded by commandeering innocent people’s computers as spam relays and endeavoring to make every spam email unique - but big data analytics can see the hidden pattern.

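Another example is antimalware technology. Top software vendors are trying to crack the code of accurate security analytics in order to do the following:

  • Monitor command-and-control bots and warn you when a computer is compromised.
  • Monitor legitimate-looking network traffic and flag anomalies.
  • Track persistent attackers.
  • Distinguish legitimate logins from credential-theft attacks.
  • Find phishing, cybercrime, and websites that use malicious scripts.
  • Determine whether a transaction used a customer's information.
  • Detect data misuse inside the organization.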
Another long-used analytic technique is antimalware heuristics. As viruses and other malware used sophisticated permutation engines to appear unique for each user, antimalware vendors started looking for bad behavior patterns during their regular scans. An unknown program exhibiting malware behavior (infecting other files, hiding during boot-up, and so on) gets ranked for each noticed behavior. After enough potentially malicious behaviors accrue, the antimalware vendor marks the program as malicious and assigns it a generic malware ID that most closely matches the behavior.
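
The behavior-ranking idea in this paragraph can be sketched as a weighted score that accumulates until it crosses a threshold. The behavior names, weights, and threshold below are hypothetical values chosen for illustration; real antimalware engines use far richer signals.

```python
# Hypothetical weights per suspicious behavior (assumptions for this sketch).
BEHAVIOR_WEIGHTS = {
    "infects_other_files": 5,
    "hides_during_boot": 4,
    "modifies_startup_entries": 2,
    "opens_network_listener": 1,
}
MALICIOUS_THRESHOLD = 8  # assumed cut-off, not taken from any vendor


def score_program(observed_behaviors):
    """Return (score, is_malicious) for the behaviors seen during a scan."""
    # Each distinct behavior counts once; unknown behaviors add nothing.
    score = sum(BEHAVIOR_WEIGHTS.get(b, 0) for b in set(observed_behaviors))
    return score, score >= MALICIOUS_THRESHOLD
```

For example, `score_program(["infects_other_files", "hides_during_boot"])` crosses the assumed threshold, while a program that only opens a network listener does not.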

The top security software vendors are trying to crack the code of accurate, trustworthy computer security analytics. We’re collecting most of the data we need, but we must figure out what gives us the most accurate results -- and what data we’re missing. Our early attempts at big data security analytics include companies and services that do the following:

  • Monitor command-and-control centers for malicious bots and tell you when your computers connect to those sites, indicating compromise
  • Monitor legitimate-appearing network traffic to flag malicious, tunneled traffic
  • Track multiple advanced persistent threat gangs and their activities
  • Distinguish between legitimate logins and malicious pass-the-hash attacks
  • Detect phishing, fraud, and websites using malicious JavaScript redirection
  • Tell whether or not a transaction using your identity or financial information is legitimate
  • Identify insider data misuse
The road to securing networks with big data may look long, but the foundations are already being laid today.

We're definitely in the early phases of big data computer security analytics, as this CSO article explains. But the foundation of future security analytics is being laid today.
2
Deep Learning Helps Cybersecurity


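Original: How much security can you turn over to AI?

Author: Mary Branscombe

[Introduction]

Surveys show that breaches detected by someone outside the business typically take over 100 days to find. Machine learning now promises to help detect and handle such attacks faster.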

It's not always easy to know when you're under attack, or when your security has already been breached. If you're capable of detecting a breach, you might find it in as few as 10 days, but survey after survey finds that breaches that are detected by someone outside the business typically take over 100 days to find.

Can machine learning help you detect attacks more quickly and deal with them faster?

Deep Instinct, described as the first company to apply machine learning to cybersecurity, uses deep learning to study how malware behaves so that it can detect attacks in real time, reliably enough to replace a firewall.
There are some ambitious projects using machine learning. Deep Instinct is trying to use deep learning to map how malware behaves, so its appliances can detect attacks in real time, reliably enough to replace a firewall. More realistically, perhaps, Splunk is adding machine learning to its log analysis system to use behavioral analytics to detect attacks and breaches.

"Most organizations lack visibility; if you can't see it, you can't protect it. We can detect outliers," explains Splunk's Matthias Maier. "We summarize similar users who have similar behavior and then we show that, and if there's an outlier who has always behaved similarly but is now behaving differently? That's an anomaly you want to look at."
Splunk (data analytics software) is also adding machine learning to its log analysis system, using behavioral analytics to prevent data leaks. Splunk can analyze users, computers, IP addresses, data files, and applications for anomalies. The company is also working on detecting insider attackers and external intruders who hold valid credentials.
Splunk can analyze users, computers, IP addresses, data files and applications for unusual behavior, and you don't need to hire machine learning experts. "We [do] a lot of this right out of the box," says Maier. "Most organizations don't have the capability to develop this on their own." Early adopters include John Lewis and Armani's retail stores.
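
The outlier idea Maier describes (someone who has always behaved one way and now behaves differently) can be illustrated with a simple z-score against a user's own history. This is a generic statistical sketch, not Splunk's method; the function name `is_outlier` and the threshold of three standard deviations are assumptions.

```python
from statistics import mean, stdev


def is_outlier(history, today, threshold=3.0):
    """Flag today's value if it deviates strongly from the user's own history.

    history: past daily counts (e.g. files accessed per day) for one user.
    threshold: assumed number of standard deviations that counts as anomalous.
    """
    if len(history) < 2:
        return False  # not enough data to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu  # perfectly constant history: any change stands out
    return abs(today - mu) / sigma > threshold
```

The same comparison could be run against the average of a peer group of similar users instead of one user's own history.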

Just detecting anomalies can still leave you with a lot of data to look at. A large organization could see thousands of anomalies a day, so Splunk uses further analysis to keep that manageable. Maier expects the tool to surface five or 10 threats a day, in enough detail to make it clear what's happening (avoiding the problem where noisy or overly complex alerting systems are ignored when they find a real breach).

"We have the full picture on the ‘kill chain' [of the attack]. We provide a security organization with the information, from the compromise point -- when did the attacker come in, what was the initial attack vector, when did they expand in this environment, what other files or servers or user accounts did they connect to? -- and then the exfiltration phase when they were sending data out … From all these anomalies and individual data points, we create a full picture and present it in a way that every security analyst can understand."

You can also use the machine learning features in Splunk for more intelligent operations and monitoring, like having your web site alert you that it's going to need more bandwidth because demand is increasing before the load causes problems, extending the usual analysis options Splunk is known for. But on the security side, Maier says, "We're concentrating on providing full solutions: detecting insider fraud, or detecting external attacks with valid credentials."
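Microsoft's Advanced Threat Analytics tool also uses machine learning: it studies user accounts, network traffic, and security information and event management systems, then profiles their normal behavior as the basis for behavioral analysis. Microsoft presents the suspicious activity it detects in an event timeline (see the figure below). The system can even set traps to mislead attackers.
(Figure: Microsoft's Attack Timeline)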


Microsoft's Advanced Threat Analytics tool (based on its Aorato acquisition) takes a similar machine learning approach: it learns about entities such as user accounts and devices from Active Directory, network traffic, and your security information and event management (SIEM) systems, then profiles their normal behavior to perform behavioral analysis. It also detects suspicious activities, which it presents in an Attack Timeline, complete with recommendations for dealing with the issue.

"We analyze all the Active Directory data, all the natural traffic going in and out of your domain controllers," says Microsoft's Anders Vinberg. "You can fake a lot of things but not natural traffic. We build a graph of which devices you interact with, which resources you access. We start learning normal behavior and once we have learned that, we begin alerting you." The system also creates traps to mislead attackers.

ATA concentrates on three types of suspicious activities. The first are mistakes and misconfigurations that amount to security risks in your network. "These are security issues that make the life of an attacker much easier, like using plaintext passwords over the wire," says Vinberg. It can also detect common attacks in real time, including the Pass-the-Ticket and Pass-the-Hash attacks commonly used to move from one system in your network to another.

The third area is where the machine learning comes in. "We detect abnormal behavior. There is always new malware, there are always new attacks … but every one of them would show up as abnormal behavior, because the account would act differently in the network from the regular user behavior," he explains.
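
The learn-then-alert pattern Vinberg describes, a graph of which accounts touch which resources, can be sketched as follows. This is a toy illustration rather than how ATA works internally; the class name `AccessBaseline` and its methods are hypothetical.

```python
from collections import defaultdict


class AccessBaseline:
    """Toy account-to-resource graph: learn first, then alert on new edges."""

    def __init__(self):
        self.known = defaultdict(set)  # account -> resources seen while learning
        self.learning = True

    def observe(self, account, resource):
        """Record an access; return True if it should raise an alert."""
        if self.learning:
            self.known[account].add(resource)
            return False
        # After learning, any edge missing from the graph counts as abnormal.
        return resource not in self.known.get(account, set())
```

After a learning period, setting `learning = False` switches the object into alerting mode, so an account that suddenly reaches a resource it has never used gets flagged.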

"We're using huge machine learning systems and world class techniques to protect all the identities at Microsoft," points out Alex Weinert, from Microsoft's Identity, Security and Protection group. "That includes Azure Active Directory, the Microsoft account system and Skype. Because we have one of the largest mail systems in the world, we are heavily targeted. Every attack that happens will pass our door; they'll try it against Google but they'll try it against us as well."

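Like Deep Instinct, Microsoft studies the methods attackers use, turning their own weapons against them. It sees about 13 billion login transactions and tens of terabytes of data a day, all of which serve as training material for Microsoft's machine learning systems to counter attacks in real time.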

Microsoft also gets to see the methods attackers are using. "We see where attacks are coming from at a very nuanced level, and what attacks are shaped like, in both the consumer and enterprise space," says Weinert. "The adaptability of the bad guys means that the things that mattered yesterday may not matter today. And no-one in the enterprise space has the volume we have [to learn from]."

That volume is tens of terabytes a day and 13 billion login transactions, which are fodder for Microsoft's machine learning systems to stay up to date on the latest attacks. A deluge of data is only part of what you need to build a system like this. According to Weinert, "a relatively sophisticated and well trained machine learning system takes years, and you also need some expert level human supervision to look and see if there is anything the system isn't catching."

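A sophisticated, well-trained machine learning system cannot be built overnight; it takes years. Many machine learning systems only detect anomalies after the fact and cannot defend in advance, and proactive protection is what Microsoft is working toward. Every day it studies the newest attack patterns and uses the system to generate new code that is deployed on its front-end servers, so that defenses stay a step ahead of attackers.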

That matters because this is about more than spotting patterns and warning you later. As Weinert points out, "the goal is protection, not remediation. A lot of machine learning systems detect what's happened. Our primary goal is to stop attacks getting through, so we're training our protection systems. Every day we learn the nuances of the newest attack patterns … and we use the system to generate code on our front end servers that scores everything that comes through." That score uses around a hundred factors, from browser user agent strings to the time of day.
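
A score built from many factors can be pictured as a weighted sum passed through a logistic function. The sketch below uses three hypothetical factors where the article mentions around a hundred; the feature names, weights, and bias are invented for illustration and say nothing about Microsoft's real model.

```python
import math

# Hypothetical weights and bias; a real system would learn these from data.
WEIGHTS = {"new_user_agent": 1.5, "unusual_hour": 1.0, "failed_attempts": 0.4}
BIAS = -3.0


def login_risk(user_agent, hour, failed_attempts, known_agents, usual_hours):
    """Return a risk score between 0 and 1 for a single login attempt."""
    features = {
        "new_user_agent": 0.0 if user_agent in known_agents else 1.0,
        "unusual_hour": 0.0 if hour in usual_hours else 1.0,
        "failed_attempts": float(min(failed_attempts, 10)),  # capped count
    }
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing into (0, 1)
```

A front-end server could then block or challenge any login whose score exceeds a chosen cut-off, which matches the stated goal of stopping an attack rather than reporting it afterwards.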

At some point, you can expect the machine learning systems in Azure AD and ATA to start working together. "Active Directory on premise is this incredible nexus for data collection and analysis because essentially every use of an app on premise ends up going through the directory somehow," points out Microsoft's Alex Simons. "Part of the vision is to take all the data we're collecting in the cloud and to marry it up with data we're collecting on premise, to bring those data sources together."

Whether you look at on premise or cloud systems, it might be time to take machine learning security systems seriously, because as bad as things are today, it's going to get even harder to stay ahead of the hackers. Weinert warns: "We're now seeing that the criminals are starting to invest in machine learning systems themselves."


